💾 Archived View for dioskouroi.xyz › thread › 30447705 captured on 2022-03-01 at 15:20:51. Gemini links have been rewritten to link to archived content


-=-=-=-=-=-=-

Ledit: Simple, GPU-rendered, no bullshit text editor

Author: todsacerdoti

Score: 128

Comments: 76

Date: 2022-02-23 22:51:12


________________________________________________________________________________

kazinator wrote at 2022-02-24 01:16:25:

But editing is perfectly pleasant over a serial line. Even at speeds as slow as 9600 bps --- if the screen updates are optimized, like with curses.

I'm editing a file on an embedded system in a PuTTY serial window (115.2 kbps). I'm in a remote desktop connection over a VPN to get to that PuTTY window. (1920x1200 resolution). Everything is responsive; no complaints.

(I'm using BusyBox vi. OMG, it has poor optimization of refresh. I just switched to 9600 bps for fun and opened a small file in the editor. So most of the lines are filled with the tilde character, characteristic of Vi. When I type Ctrl-R to refresh the screen, BusyBox is repainting those ~ lines in their entirety. It's printing all the spaces! It has no knowledge of the "clear to end of line" or "clear from here to bottom of screen" escape sequences. It also scrolls the buffer by repainting, even single line scrolls when you move the cursor down from the bottom line or up from the top line.

At 115200 bps, you don't notice any of this.)
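For reference, the escape sequences in question, assuming a VT100/ANSI-compatible terminal: "erase to end of line" (EL) and "erase to end of screen" (ED) cost a few bytes each, whereas repainting with spaces costs one byte per blanked column. A tiny sketch:

    #include <cstdio>

    int main() {
        std::printf("\x1b[2J\x1b[H");  // ED(2) + CUP: clear screen, cursor to home
        std::printf("~\x1b[K\n");      // draw '~', then EL: erase to end of line
        std::printf("\x1b[J");         // ED: erase from cursor to end of screen
        return 0;
    }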

cout wrote at 2022-02-24 06:13:33:

9600 bps is 1200 cps (about 960 cps once you add start and stop bits), which is plenty for remote echo to keep up with even a fast typist; I think that has a lot to do with why it feels snappy.

Some web pages can't keep up with a moderately fast typist, even though it's all rendered locally.

onion2k wrote at 2022-02-24 10:05:47:

When I was at university I tinkered with writing a MUD app that ran on telnet over a modem that would echo back inputs. It was incredibly slow once there were more than 4 or 5 users because my code sucked. I can't use that as evidence that modems weren't fast enough.

It seems obvious that how quickly a web page can refresh is entirely down to the skill of the developers who created it. There's nothing inherently slow about a web page - plenty of pages update quickly enough. Browsers are _fast_. Some userspace code is not. If you're using a page that doesn't work fast enough that's down to the developer, not the web itself.

gumby wrote at 2022-02-24 02:03:51:

Does emacs still have slow mode? I haven’t used it since the old TECO days but you could edit quite reasonably with a 150 baud line.

Of course back then you could also use emacs on a printing terminal, which surely won’t work any more.

lallysingh wrote at 2022-02-24 02:37:10:

Why not? Text mode is still well supported

gumby wrote at 2022-02-24 03:45:19:

Slow mode was designed to minimise (which included delaying) display updating. So you could type a bunch and only when you stopped typing would you see the display update to the final state -- intermediate states weren't rendered (so you could type ab<delete><delete>cd and only after you stopped typing would you see "cd").

Printing terminal mode was similar in that it showed only the line you were on and, IIRC, optionally the line above or below. This worked pretty well in Emacs, where search is extremely lightweight and is (for most people at least) the primary cursor movement tool. Of course deletion couldn't display properly, so you would get a marker on the screen (#?) to indicate it, and then ^L would print the then-current copy of the line.

You could write this support today, and perhaps someone has, but I wonder if it could possibly be worth it.

This same reason is why the unix terminal driver by default starts up with # as the delete-character command and @ to delete the whole line. That line discipline dates back to Multics, which had more efficient I/O because the Multics hardware had channel controllers (all our I/O is done with channel controllers these days, though we don't call them that).

Actually, speaking of channel controllers: Emacs had a third mode. If you used the SUPDUP protocol (it stood for "super duper": RFC 739, 746, 749), a bunch of the I/O and screen update was done by your local host, not the remote machine on which you were running Emacs (or whatever). It implemented an abstract display and could even interpret a few commands to display the local buffer properly, like a souped-up channel controller! This all predated Unix on the ARPANET, so later approaches went in a different direction.

But remember this was back when you could have dozens of people using networked computers with clock speeds in the hundreds of kilohertz or less, back before the cross continent backbone was upgraded to ~56 Kbps.

ssivark wrote at 2022-02-24 00:57:33:

I understand this was an exploratory project, but from a practical perspective, how does this compare with one of the standard terminal-based editors, running in a GPU accelerated terminal emulator like alacritty/etc?

KennyBlanken wrote at 2022-02-24 05:41:10:

Programs that use the native OS's APIs are "GPU accelerated", too. "GPU accelerated" editors and terminal programs are snake-oil, just another spin on "minimalist" text editors.

Instead, buy a high-refresh-rate monitor with good pixel response time and input latency. Probably 90% of the people reading this post are on a 60 Hz display, and many of those monitors may have noticeable input delay (Microsoft found that people can notice even a few milliseconds of input/display delay). Even just 75 Hz or 100 Hz will feel noticeably nicer.

There are a couple of pretty inexpensive ~95 Hz monitors aimed at office users, and 2.7K ~150 Hz gaming monitors aren't that expensive if you shop carefully, even nice ones with IPS panels and good color gamut/calibration.

Everything is nicer with higher refresh rates: scrolling through web pages, file listings, editing text files, terminal sessions. The same is true of gaming mice; if you do a lot of mouse work, a quality "gaming" wireless mouse is durable, has virtually zero input lag, great tracking, very accurate response to quick "flicks" (letting you move windows and such around), and high polling rates. The default on most mice is around 120 Hz. Yes, you can tell the difference, though I have my mouse set to either 250 Hz or 500 Hz when not gaming, I forget which.
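For concreteness, the rough arithmetic behind the refresh-rate and polling-rate point above (a sketch, not figures from any particular monitor or mouse): the interval between updates at a given rate, and the average extra wait, about half an interval, before a just-missed change is picked up.

    #include <cstdio>

    int main() {
        const double rates_hz[] = {60.0, 75.0, 100.0, 144.0, 250.0, 500.0};
        for (double hz : rates_hz) {
            double interval_ms = 1000.0 / hz;        // time between updates
            std::printf("%6.1f Hz: %5.2f ms/update, ~%.2f ms average wait\n",
                        hz, interval_ms, interval_ms / 2.0);
        }
        return 0;
    }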

Edit: just to note that someone pointed out keyboards matter too, and they're right. Some keyboards can have over 50 ms of input delay, which, when you think about it, is fucking nuts in the year 2022.

RTINGS tests keyboard latency, and their idea of "high latency" for a gaming keyboard can be 10 ms. The difference between that and the fastest (1-2 ms) keyboards is probably not worth selecting for over features like ergonomics and daily-use factors.

beardog wrote at 2022-02-24 05:49:11:

Likewise, keyboards can matter a bit too:

https://danluu.com/keyboard-latency/

KennyBlanken wrote at 2022-02-24 06:27:26:

Very true. Some keyboards have latency twice as bad as the round-trip ping time of a server you might be connecting to half-way across the country, which is...nuts. There really isn't any reason for a keyboard to take _half of a tenth of a second_ to register a keypress.

RTINGS tests for latency, but I would caution folks that "high latency" to them is ~8 or 9 ms, which, as Dan's tests show, can be almost 10x better than some common keyboards out there.

dasKrokodil wrote at 2022-02-24 13:10:49:

Good points!

However, the high-refresh-rate monitor also needs to be supported. I have a 144 Hz monitor which works fine with my Windows PC, but when I connect it to my work MacBook, it goes back to 60 Hz. In the macOS display settings, there is a dropdown for the refresh rate, but it only lets me choose between 50 and 60 Hz... :(

peakaboo wrote at 2022-02-24 06:26:06:

I see a huge difference using Kitty, it's much faster than other terminals.

vidarh wrote at 2022-02-24 13:19:16:

Kitty is decent, and I've used it a lot, but on my system at least rxvt-unicode beats Kitty's throughput by a factor of 2. If you've got a much faster GPU your mileage may vary, but I can't even tell the performance difference unless I'm timing something like catting a large file to the terminal, which isn't something I do on purpose.

Currently I'm using mlterm which is slower than Kitty again by a significant margin, and I still don't notice the difference in day to day use.

(xterm, st and gnome-terminal are all slower than Kitty by huge margins; of the terminals on my system (I have a lot because I tested a bunch of them a while back...) there's a performance difference in terms of rendering throughput of at least a factor of 10, and yet even the slowest ones are perfectly fine for most use other than the cat-a-large-file test)

anthk wrote at 2022-02-24 13:26:40:

This. Since XV days, X's 2D rendering has been accelerated by GPUs.

vidarh wrote at 2022-02-24 13:37:19:

You don't even need 2D acceleration for this beyond _maybe_ a basic region blit.

Ever since the Amiga days, rendering text to bitmap displays at sufficient speed has been a solved problem (with only scrolling accelerated by the blitter), and resolution and the number of bits per pixel have increased at a much slower pace than CPU performance since then.

Most X terminals use one of roughly three basic approaches to glyph rendering anyway (as shown by redirecting them through xtruss to log the X requests they make), and their throughput still varies from far faster than Kitty (as per rxvt) to far slower, so apart from possibly the fastest ones, most of the X terminals spend most of their time outside of their rendering code.

dahart wrote at 2022-02-24 16:29:29:

Xv doesn’t do any rendering, though, right? It displays video. It can’t be used to implement a text editor’s rendering engine.

user3939382 wrote at 2022-02-24 01:42:18:

The prebuilt executable failed because it couldn’t dynamically load libpng and after I compiled it myself I got a seg fault.

liz3_de wrote at 2022-02-25 00:18:53:

Contact me or open an issue on GitHub if you want help.

Liz3#0001

yann@liz3.net

oauea wrote at 2022-02-24 00:25:47:

> 3-4 byte unicode characters will not work

That's a tad too simple

mananaysiempre wrote at 2022-02-24 00:31:42:

(Notably 3-byte characters include all or almost all non-astral CJK, which is otherwise stupidly simple to draw.)

krona wrote at 2022-02-24 01:00:51:

This is using FreeType to CPU render to a text atlas, to be clear.

deathanatos wrote at 2022-02-24 02:44:54:

But is this wrong?

I think the current implementation _might_ have some bugs, but why _not_ create an atlas? Especially for code, most of the characters will be in ASCII, and if they're not, you can always push what isn't dynamically as needed. Most of the rendering still occurs GPU side.

I suspect the post's code is also cheating on shaping, but even that seems like something you should be able to at least cache on the GPU?

Like, the GPU handles 98% of the cases, and if the CPU needs to step in for the 2% of oddball stuff, so what?

(Edit: perhaps I'm projecting the "wrong" part onto your comment; there's a lot of "why?" in these comments and the more I think about it, it honestly seems pretty sensible to push graphics to the *G*PU.)

krona wrote at 2022-02-24 12:40:46:

I'm just commenting on the extent to which it is 'GPU-rendered', which in this case is apparently no more than the standard text rendering facilities provided by the OS.

That's not to say it's not impressive considering how simple it is.

liz3_de wrote at 2022-02-25 00:15:02:

FreeType renders an atlas covering the ASCII range, which is then uploaded to the GPU; other characters are then lazily loaded into the buffer.
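For illustration, a minimal sketch of that ASCII-atlas-plus-lazy-load approach, assuming FreeType and an OpenGL 3+ context (with a function loader already set up); the Atlas struct and its naive shelf packing are invented for this sketch, not taken from ledit's source.

    #include <ft2build.h>
    #include FT_FREETYPE_H
    #include <GL/gl.h>        // assumes an already-created GL 3+ context/loader
    #include <algorithm>
    #include <map>

    struct Glyph { float u0, v0, u1, v1; int w, h, bearing_x, bearing_y, advance; };

    struct Atlas {
        static constexpr int SIZE = 1024;           // atlas texture is SIZE x SIZE
        GLuint tex = 0;
        int pen_x = 0, pen_y = 0, row_h = 0;
        std::map<unsigned long, Glyph> glyphs;      // codepoint -> placement

        void init() {
            glGenTextures(1, &tex);
            glBindTexture(GL_TEXTURE_2D, tex);
            glPixelStorei(GL_UNPACK_ALIGNMENT, 1);  // FreeType rows are byte-aligned
            glTexImage2D(GL_TEXTURE_2D, 0, GL_RED, SIZE, SIZE, 0,
                         GL_RED, GL_UNSIGNED_BYTE, nullptr);
            glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
            glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        }

        // CPU-rasterize one glyph with FreeType and copy it into the texture.
        // Called up front for the ASCII range, and lazily the first time any
        // other codepoint is drawn.
        const Glyph& load(FT_Face face, unsigned long codepoint) {
            auto it = glyphs.find(codepoint);
            if (it != glyphs.end()) return it->second;
            FT_Load_Char(face, codepoint, FT_LOAD_RENDER);
            FT_Bitmap& bm = face->glyph->bitmap;
            if (pen_x + (int)bm.width >= SIZE) {    // naive shelf packing
                pen_x = 0; pen_y += row_h + 1; row_h = 0;
            }
            glBindTexture(GL_TEXTURE_2D, tex);
            glTexSubImage2D(GL_TEXTURE_2D, 0, pen_x, pen_y, bm.width, bm.rows,
                            GL_RED, GL_UNSIGNED_BYTE, bm.buffer);
            Glyph g = { pen_x / (float)SIZE, pen_y / (float)SIZE,
                        (pen_x + bm.width) / (float)SIZE,
                        (pen_y + bm.rows) / (float)SIZE,
                        (int)bm.width, (int)bm.rows,
                        face->glyph->bitmap_left, face->glyph->bitmap_top,
                        (int)(face->glyph->advance.x >> 6) };  // 26.6 fixed point
            pen_x += (int)bm.width + 1;
            row_h = std::max(row_h, (int)bm.rows);
            return glyphs[codepoint] = g;
        }
    };

Drawing the quads that sample the texture is omitted; the point is that rasterization happens once per glyph on the CPU, and every frame after that is just textured quads on the GPU.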

dahart wrote at 2022-02-24 04:16:15:

Isn’t that how more or less all editors render text, on the GPU and on the CPU?

Pulcinella wrote at 2022-02-24 05:41:05:

It’s possible to perform the rendering of the font itself on the GPU (e.g. Pathfinder) instead of rendering an atlas on the CPU and sending it over to the GPU, but I’m not sure of any text editors that do so.

dahart wrote at 2022-02-24 05:45:41:

Of course, it’s definitely possible. Just doesn’t seem like there are good reasons to spend time directly rendering the same font characters repeatedly, right? That’d be spending a lot of compute doing something you can pre-render and cache in a texture. ;) This is why a lot of CPU implementations render text atlases too.

Pulcinella wrote at 2022-02-24 18:37:19:

I suppose you could also pre-render to an atlas on the GPU.

I would also say, personally, there is something aesthetically pleasing about rendering vector graphics (fonts, SVG, etc.) on the GPU. “It’s a graphics processing unit, shouldn’t it be handling vector graphics as well?!” :)

dahart wrote at 2022-02-24 18:44:35:

Totally, you could. But in the case of a text editor, that would probably be purely to flex and not because it’s faster or better or easier, right?

People are definitely doing tons of vector rendering on the GPU. And the GPU is fantastic for vector graphics generally, mainly still because of the conversion to pixels. I posted links to Hugues Hoppe's paper and the NV path rendering library, for example. Those can do font rendering as well, but I still wouldn't use those for rendering fonts directly or for rendering text atlases because it would be difficult to implement.

Vogtinator wrote at 2022-02-24 09:06:43:

Yep, that's basically what glyph caches do.

cosmotic wrote at 2022-02-23 23:51:22:

Just curious, why GPU rendered?

mananaysiempre wrote at 2022-02-24 00:30:25:

Because it's very easy for the CPU-GPU interface to become the bottleneck, as state-of-the-art resolutions require shoving around literal gigabytes per second to get a standard framerate (seriously, make a rough estimate of how much data your 300 ppi phone or tablet is pumping over the bus to the screen; it's chilling), and the less you do it the better. (See an HN comment about Audacity slowing down in recent years [as screen resolution increases]:

https://news.ycombinator.com/item?id=26498649

.) Getting a not-quite-SIMD processing unit specializing in throughput above all else to do your heavy lifting is a bonus.

I suspect the underlying question is “why a supposedly-3D-optimized GPU for a 2D task?” While it’s true that modern GPUs are 3D-optimized, that’s AFAICS because the decade-long lull of 1MP-or-so screens and ever-more-powerful desktop CPUs _ca._ 2000–2010(?) made CPU 2D rendering mostly “fast enough” so as the programmable 3D pipeline emerged the 2D accelerator from the workstation era died out, leaving perhaps only an optimized blit behind.

Compute- and power-constrained handheld devices and higher-resolution screens made developers (software _and_ hardware) wake up in a hurry, but now a redraw-everything-every-frame, 3D-adapted graphics facility is what you have, so a redraw-everything-every-frame, 3D-adapted graphics facility is what you shall use. The much more parallel and power-efficient processor and wider bus are still easily worth it, if you spend the effort to wrangle them.

(It’s interesting to think what 2D-optimized hardware would look like. Do people know how to do analytic or even just AGG-quality 2D rasterization on a GPU? Or anything but simple oversampling.)

Not a graphics programmer, treat with a measure of skepticism.
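To make that back-of-the-envelope estimate concrete: the panel resolution, colour depth and refresh rate below are assumed figures for a roughly iPad-class ~300 ppi screen, not measurements.

    #include <cstdio>

    int main() {
        const long long width = 2048, height = 2732;  // assumed ~300 ppi tablet panel
        const long long bytes_per_pixel = 4;          // RGBA8
        const long long fps = 60;
        const long long per_frame  = width * height * bytes_per_pixel;
        const long long per_second = per_frame * fps;
        std::printf("%.1f MB per frame, %.2f GB/s at %lld fps\n",
                    per_frame / 1e6, per_second / 1e9, fps);
        // Prints roughly: 22.4 MB per frame, 1.34 GB/s at 60 fps
        return 0;
    }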

ppier wrote at 2022-02-24 03:23:27:

Raph Levien's work, which features GPU rasterization, was recently posted on HN [1].

Also, Pathfinder [2] has the ability to do most of its rasterization on the GPU.

I read somewhere that an experimental renderer for Google’s Fuchsia (Spinel?) [3] also does this.

1:

https://github.com/linebender/piet-gpu

2:

https://github.com/servo/pathfinder

3:

https://www.tdcommons.org/cgi/viewcontent.cgi?article=1580&c...

bityard wrote at 2022-02-24 04:44:22:

I have been using computers on a daily basis since the mid 80's. I have seen many applications that stretched or exceeded the capabilities of the machine I was running them on, resulting in sub-par performance that left me wishing for a hardware upgrade to make using the application a more pleasant experience.

None of them were text editors or terminal emulators.

vidarh wrote at 2022-02-24 12:53:46:

Notably there's something like an order of magnitude or more difference in performance between terminals that people are perfectly happy to use. (EDIT: To quantify this: on my system, catting the same file to a window taking about half my screen took an average of 1.443s with st, 1.16s with xterm, 0.165s with rxvt, and 0.404s with Kitty - all averaged over 3 runs)

Very few non-GPU-accelerated terminals get anywhere near maximising performance, and they're still fast enough that most people don't even notice, because most applications, including most editors, once even remotely optimized themselves, don't tend to push even a non-accelerated terminal very hard.

Put another way: a typical test case of the performance of a terminal tends to be spewing log output or similar to it at a rate where the obvious fix is to simply decouple the text buffer from the render loop, because nobody can read it all anyway (but really, a lot of the developers of these terminals should start by looking at rxvt and at least get to a decent approximation of it first before they start doing those kinds of shortcuts).

Fast enough terminal rendering was solved by the mid 1980s, and performance increases have outpaced resolution increases by a substantial factor.
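A minimal sketch of the decouple-the-buffer-from-the-render-loop idea mentioned above: input is parsed as fast as it arrives, but the screen is repainted at most at a fixed rate, so a flood of output never forces one repaint per line. The names and the 16 ms tick are illustrative, not taken from any particular terminal.

    #include <atomic>
    #include <chrono>
    #include <cstddef>
    #include <mutex>
    #include <string>
    #include <thread>

    struct Screen {
        std::mutex m;
        std::string buffer;            // stand-in for the terminal's cell grid
        std::atomic<bool> dirty{false};
    };

    // Called for every chunk read from the pty: parse/append only, no painting.
    void feed(Screen& s, const char* data, std::size_t n) {
        std::lock_guard<std::mutex> lock(s.m);
        s.buffer.append(data, n);
        s.dirty = true;
    }

    // Runs independently: repaints at most ~60 times a second, and only the
    // currently visible lines, no matter how fast feed() is being called.
    void render_loop(Screen& s, std::atomic<bool>& running) {
        using namespace std::chrono_literals;
        while (running) {
            if (s.dirty.exchange(false)) {
                std::lock_guard<std::mutex> lock(s.m);
                // repaint_visible_lines(s.buffer);  // hypothetical draw call
            }
            std::this_thread::sleep_for(16ms);
        }
    }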

dahart wrote at 2022-02-24 05:42:42:

> why a supposedly-3D-optimized GPU for a 2D task […] 3D-adapted graphics facility is what you shall use.

Modern GPUs & so-called 3D APIs don’t really do all that much that is 3D specific. @greggman likes to talk about WebGL as a fundamentally 2D API

https://webglfundamentals.org/webgl/lessons/webgl-2d-vs-3d-l...

The bulk of the 3D work is still 2D, filling all the pixels, rasterizing 2D triangles after they’ve been transformed, projected to 2D, and clipped. Shaders might have lots of 3D math, but that’s really just pure math and no different from 2D math as far as the GPU is concerned.

> Do people know how to do analytic or even just AGG-quality 2D rasterization on a GPU? Or anything but simple oversampling.

Yes, I think so - if you mean quads & triangles.

The problem, of course, is analytic filtering (even 2D) usually isn’t worth the cost, and that oversampling is cheap and effective and high enough quality for most tasks. (Still, adaptive oversampling and things like DLSS are popular ways to reduce the costs of oversampling.) The path rendering people think carefully about 2d path rendering though:

https://hhoppe.com/ravg.pdf

https://developer.nvidia.com/gpu-accelerated-path-rendering

And don't forget that mipmapped texture sampling, even in 2D, is better than oversampling.

A very common shader trick for 2D antialiasing that's better than oversampling (not analytic, but sometimes damn close) is to use the pixel derivatives to compute the edges of a mask that blends from opaque to transparent over ~1 pixel.

https://www.shadertoy.com/view/4ssSRl
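Transcribed to plain C++ as an illustration, that trick boils down to the per-pixel math below; in a GLSL fragment shader, fwidth() and the built-in smoothstep() would be used, and d would be a signed distance to the shape's edge.

    #include <algorithm>

    // GLSL-style smoothstep(): cubic ramp from 0 to 1 between edge0 and edge1.
    static float smoothstep(float edge0, float edge1, float x) {
        float t = std::clamp((x - edge0) / (edge1 - edge0), 0.0f, 1.0f);
        return t * t * (3.0f - 2.0f * t);
    }

    // Coverage (alpha) of a pixel at signed distance d from the edge, where
    // fwidth_d approximates how much d changes between neighbouring pixels
    // (what fwidth(d) returns on the GPU). Alpha fades across roughly one pixel.
    float edge_coverage(float d, float fwidth_d) {
        float w = 0.5f * fwidth_d;
        return smoothstep(-w, w, d);
    }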

zozbot234 wrote at 2022-02-24 00:57:04:

> Do people know how to do analytic or even just AGG-quality 2D rasterization on a GPU?

I suppose the sensible answer is "do it in a compute shader if you care about pixel-perfect accuracy". Which you arguably should for 2D stuff, given that the overhead is low enough.

adastra22 wrote at 2022-02-24 00:54:33:

GPUs used to be 2D oriented. You’re making me feel very old now, lol.

zozbot234 wrote at 2022-02-24 01:00:32:

Sure, but did that "2D" orientation boil down to anything beyond accelerated blitting and perhaps a few geometry-rendering primitives? Part of the problem is also that there never was a standard feature set and API for 2D acceleration comparable to OpenGL or Vulkan. So support for it was highly hardware-dependent and liable to general bit rot.

adastra22 wrote at 2022-02-24 02:45:40:

Yes, there were other hardware acceleration tricks implemented. For example, hardware sprites allowed for bitmaps that were composed in real time for display rather than blitted to the framebuffer. This trick is still used for the mouse cursor. There was hardware scrolling which meant you could move the whole display and only render the edge that came into view. Both of these together is how platformer games were implemented on the SNES and similar hardware of that generation, and it’s why the gameplay was so smooth. The NES could either do hardware sprites or hardware scrolling, but not both IIRC, which is why the game world freezes when Link reaches the edge of the screen and the new screen comes into view in the original Zelda.

There were other hardware accelerations my memory is somewhat more vague on. I remember there was some color palette translation hardware, and hardware dithering.

There wasn’t any standard cross platform graphics api back then, but that’s more a statement about the era. Everyone wrote directly to the metal.

zozbot234 wrote at 2022-02-24 03:00:04:

> For example, hardware sprites allowed for bitmaps that were composed in real time for display rather than blitted to the framebuffer.

You can do all these things with compositing, though. A "sprite" is just a very limited hardware surface where compositing happens at scanout, and some modern GPUs have those too.

CodeArtisan wrote at 2022-02-24 01:48:44:

VESA had one but it never took off.

https://web.archive.org/web/20081209121702/http://www.vesa.o...

cosmotic wrote at 2022-02-24 03:07:12:

I guess I'm just looking for some evidence to support the decision to use GPU. Show some framerate benchmarks, some CPU usage comparisons, etc.

dahart wrote at 2022-02-24 04:28:17:

Why does it need “evidence”? GPUs are fast and good at displaying pixels. Moving rendering to the GPU lets the CPU focus on things it’s good at rather than bog it down with a high bandwidth task that has to go to the GPU anyway. Lots of editors have been getting GPU acceleration (I use Sublime and Visual Studio among others, both have hardware acceleration). All major browsers and operating systems support hardware accelerated drawing. Video games of course… I think the question is why wouldn’t you use the GPU for rendering these days? It’s a pixel processor that nearly everyone has in their machine. What evidence is there to support the decision to render pixels on a CPU rather than use the rendering co-processor?

anthk wrote at 2022-02-24 13:28:49:

Because that's snake oil. I had accelerated 2D since XV days (XVideo extension for X) back in 2003.

dahart wrote at 2022-02-24 15:15:03:

I’m very confused by that comment, what do you mean snake oil? Are you saying GPUs don’t do anything? Isn’t the XV you’re referring to in the same category as today’s GPU wrt parent’s comment? @cosmotic asked for evidence why not to use the CPU, and XV is not implemented on the CPU.

Also it looks like XV did video resizing and some color mapping (

https://en.wikipedia.org/wiki/X_video_extension

). Today's GPUs are doing the rendering. Filling the display buffer using primitives and textures, and outputting the display buffer, are different activities that are both done by the GPU today, but it sounds like XV didn't do the first part at all, and that first part is what we're talking about here.

anthk wrote at 2022-02-25 14:38:48:

Ok, maybe not XV, but X11/Xorg was 2D accelerated back in the days, for sure.

X without 2D acceleration was crawlish even under my AMD Athlon.

cosmotic wrote at 2022-02-24 16:49:13:

If GPUs are good at displaying pixels, the benchmarks and evidence should be easy to come by. As I mentioned in another comment, I use iTerm2 and its GPU acceleration has no visual impact but makes CPU usage much higher while sitting idle. Turning it off is a huge improvement.

dahart wrote at 2022-02-24 18:39:17:

Evidence is super easy to come by, but the question does need to be specific and well formed (what kind of rendering, exactly, are we comparing, how many pixels, what’s the bottleneck, etc.). There are loads and loads of benchmarks demonstrating GPUs are faster than CPUs at turning triangles into pixels. Not just a little faster, the numbers are usually in the ~100x range. There’s literally zero contention on this point, nobody is questioning whether simple rendering cases might be faster. Nobody is playing Fortnite with CPU rendering. Because this question is so well settled, GPU rendering is so ubiquitous that it’s even hard to test CPU software rendering.

There are certain kinds of rendering and corner cases where CPUs can have an edge, but those happen in areas like scientific visualization or high end VFX, they don’t come up often in text editor rendering engines.

You can’t use your anecdote of 1 app that might have a broken implementation to question GPUs categorically. I’ve never noticed iTerm2 using significant CPU. My iTerm2 sits at 0.0% CPU while idle. Maybe your install is busted?

stonogo wrote at 2022-02-24 00:26:36:

perhaps the answer can be found in the 'motivation' section of the linked website.

  "The base motivation was just simply that i wanted to have a 
   look into OpenGL and doing GPU accelerated things, i did not 
   plan to create a text editor from the get go.
   After starting to experiment a bit i decided to call this a 
   small side Project and implement a proper editor."

zaptheimpaler wrote at 2022-02-24 01:03:25:

Just one more answer - why not? It seems to me that GPUs are heavily underutilized most of the time a PC is running. They kick in only in a few rare applications like gaming/rendering/video decoding and run at literally 0% utilization the rest of the time (at least according to what Windows Task Manager tells me).

On the other hand, the CPU is often busy. Particularly for a text editor, there's a good chance the user is running all sorts of CPU-heavy stuff like code analysis tools, databases, compile loops etc.

electroly wrote at 2022-02-24 06:22:44:

You should see at least a little bit of GPU usage for desktop composition. Try watching Task Manager while moving a window around or scrolling.

cosmotic wrote at 2022-02-24 03:05:49:

iTerm2 has a setting to use GPU and it makes it consume _tons_ of CPU. Turning it off has no apparent adverse effect.

nomel wrote at 2022-02-24 00:15:50:

Why would you paint pixels with a CPU when you have a GPU that's optimized for it? :)

vidarh wrote at 2022-02-24 16:45:02:

That depends on how much of the effort requires CPU work in order to transfer data to the GPU relative to the work that will actually be done on the GPU. For text rendering it's possible but not a given you'll save all that much, depending on how fast your GPU is, and how big you're rendering the glyphs, and whether or not you're applying any effects to them.

For most text-heavy apps there's little reason for the text-rendering to be a bottleneck either way.

dahart wrote at 2022-02-24 19:23:14:

While you’re right that it does depend, the imbalance is so large in practice that it’s extremely difficult to find a workload where CPU wins, especially when you’re talking about 2d screen rendering. The bottlenecks are: per-pixel compute multiplied by the number of pixels, and bandwidth of transferring data to the GPU (must include this when doing “CPU” rendering).

What does CPU rendering even mean? Is it sending per-pixel instructions to the GPU, or is it rendering a framebuffer in CPU ram, and transferring that to the GPU or something else? Pure software would be the latter (save pixels to a block of RAM, then transfer to the GPU), but the line today isn’t very clear because most CPU renderers aren’t transferring pixels, they’re calling OS level drawing routines that are turned into GPU commands. Most of the time, CPU rendering is mostly GPU rendering anyway.

> there’s little reason for the text-rendering to be a bottleneck either way.

Think about 4K screens, which is up to 8M pixels to render. Suppose you're scrolling and want 60fps. If you wanted to render this to your own framebuffer in RAM, and you render in 24-bit color, then your bandwidth requirement is 8M * 3 bytes * 60 frames = 1.4GB/s. It's also 8M * 60 pixel rendering operations. Even if the per-pixel rendering was 1 single CPU instruction (it's more, probably a lot more, but suppose anyway) then the CPU load is 0.5 billion instructions per second, which is a heavy load even without any cache misses. Most likely, this amount of rendering would consume 100% of a CPU core at 4K.

Anyway, there’s no reason to have the CPU do all this work and to fill the bus with pixels when we have hardware for it.

vidarh wrote at 2022-02-24 20:10:24:

> While you’re right that it does depend, the imbalance is so large in practice that it’s extremely difficult to find a workload where CPU wins, especially when you’re talking about 2d screen rendering. The bottlenecks are: per-pixel compute multiplied by the number of pixels, and bandwidth of transferring data to the GPU (must include this when doing “CPU” rendering).

Text rendering is pretty much _the_ best case for the CPU in this respect, because rendering inline to a buffer while processing the text tends to be pretty efficient.

> they’re calling OS level drawing routines that are turned into GPU commands. Most of the time, CPU rendering is mostly GPU rendering anyway.

This is true, and is another reason why it's largely pointless for text-heavy applications to actually use OpenGL etc. directly. But what I've said also applies in a purely software rendered system. I've written terminal code with control of the rendering path for systems ranging from 1980's hardware via late 90's embedded platforms to modern hardware - there's little practical difference; from the mid 1980's onwards, CPUs have been fast enough to software render terminals to raw pixels just fine, and typical CPU performance has increased faster than the amount of pixels a typical terminal emulator needs to push.

>> > there’s little reason for the text-rendering to be a bottleneck either way.

> Think about 4K screens, which is up to 8M pixels to render.

I stand by what I said. I'd recommend trying to benchmark some terminal applications - only a handful of the very fastest ones spend anywhere near a majority of their time on rendering even when doing very suboptimal rendering via X11 calls rather than rendering client side to a buffer. Of the ones that do the rendering efficiently, the only case where this becomes an issue is if you accidentally cat a massive file, and only then if the terminal doesn't decouple its rendering from the buffer updates.

> Most likely, this amount of rendering would consume 100% of a CPU core at 4K.

So let it. You're describing an extreme fringe case of shuffling near-full lines as fast as you can, where you can still trivially save CPU if you care about it by throttling the rendering and allowing it to scroll more than one line at a time, which is worth doing anyway to perform well on systems under load or without a decent GPU, or just slow systems (doing it is trivial: decouple the text buffer update from the render loop, and run them async - this is a ~30-40 year old optimisation).

Yes, it looks ugly when you happen to spit hundreds of MB of near-full lines of text to a terminal when the system is under load. If you care about that, by all means care about GPU rendering, but in those situations about the only thing people tend to care about is how quickly ctrl-c makes it stop.

Try tracing what some terminal emulators or editors are actually doing during typical use. It's quite illuminating both about how poorly optimised most terminals (and text editors) are, and how little of their time they tend to spend rendering.

> Anyway, there’s no reason to have the CPU do all this work and to fill the bus with pixels when we have hardware for it.

A reason is simplicity, and performing well on systems with slow GPUs without spinning up fans all over the place, and that the rendering is rarely the bottleneck, so spending effort accelerating the rendering vs. spending effort speeding up other aspects is often wasted. _Especially_ because, as you point out, many systems optimise the OS-provided rendering options anyway. E.g. case in point: On my system rxvt is twice as fast as Kitty. The latter is "GPU accelerated" and spins up my laptop fans with anything resembling high throughput. Rxvt might well trigger GPU use in Xorg - I haven't checked - but what I do know is it doesn't cause my fans to spin up and it's far faster despite repeated X11 requests. A whole lot of the obsession with GPU acceleration in terminal emulators and text editors is not driven by benchmarking against the fastest alternatives, but cargo-culting.

dahart wrote at 2022-02-24 21:08:40:

> A whole lot of the obsession with GPU acceleration in terminal emulators and text editors is not driven by benchmarking against the fastest alternatives, but cargo-culting.

It seems strange to demand evidence and benchmarking but end with pure unsupported hyperbolic opinion. If you can’t explain why a huge number of major text editors are now starting to support GPU rendering directly, nor why the OS has already for a long time been using the GPU instead of the CPU for rendering, then it seems like you might simply be ignorant of their reasons, unaware of the measurements that demonstrate the benefits.

I really don't even know what we're talking about anymore exactly, since you moved the goal posts. I don't disagree that most of the time text rendering isn't consuming a ton of resources, until it does. I don't disagree that many editors aren't exactly optimized. None of that changes the benefits of offloading framebuffer work to the GPU. The ship sailed already, it's too late to quibble. I can't speak to rxvt vs Kitty (you really need to do a comparison of features and their design decisions and team sizes and budgets before assuming your cherry-picked apps reflect on the hardware in any way), but by and large everyone already moved to the GPU and the justifications are pretty solidly explained and understood; evidence for these decisions abounds if you care to look for it.

vidarh wrote at 2022-02-24 22:53:28:

I have not demanded evidence. I have suggested that you try benchmarking and measuring this for yourself, because it would demonstrate very clearly the points I've been making. I don't need evidence from you, because I've spent a huge amount of time actually testing and profiling more terminals and editors than most people can name.

> If you can’t explain why a huge number of major text editors are now starting to support GPU rendering directly

I did give a reason: It's largely cargo-culting. _Some_ have other reasons. E.g. using shaders to offer fancy effects is a perfectly valid reason if you want those effects. Most of these applications do not take advantage of that. Some, like the linked one, are learning experiences, and that's also a perfectly fine justification. Very few have put any effort into actually benchmarking their code to a terminal which is actually fast - most of the performance claims I've seen comes from measuring against slow alternatives.

Hence the "cherry-picked" rxvt: the point is to illustrate that when someone can't even match rxvt, which is open source and so easy to copy and at least match in performance, maybe they should actually figure out why rxvt is fast first before worrying about the specific low-level rendering choice.

One of the things people would learn from that is that the rendering choice is rarely where the bottlenecks arise.

> then it seems like you might simply be ignorant of their reasons, unaware of the measurements that demonstrate the benefits.

I've done enough code reviews and enough measurements and benchmarking of terminals and text rendering code over the years that I'm quite content in my knowledge that I know the performance characteristics of this better than most of those who have implemented the large number of poorly performing terminals over the years. That's not a criticism of most of them - performance on text rendering is simply such a minor aspect of a terminal, because most of them are _fast enough_, that it's rarely a priority. What this has taught me is that the rendering itself isn't typically the bottleneck. It hasn't been the bottleneck for most such projects since the 1980's. You can continue to disbelieve that all you want. Or you can try actually looking at some of this code yourself and try profiling it, and you'll see the same thing. Your choice.

> I really don’t even know what were talking about anymore exactly, since you moved the goal posts.

I did no such thing. In the comment I made that you replied to first I made two claims:

1. that it is not a given GPU use will be faster.

2. that for most text-heavy apps there's little reason for the text-rendering to be a bottleneck either way.

I stand by those.

> I don’t disagree that most of the time text rendering isn’t consuming a ton of resources, until it does.

My point is that the "until it does" is largely irrelevant, as even the most pathological case for a terminal is easily accommodated and in practice almost never occurs. And as you pointed out a lot of OS text rendering support is accelerated anyway (or will get there), which makes this largely moot anyway by making it a system concern rather than something an application ought to put effort into dealing with.

> I don’t disagree that many editors aren’t exactly optimized. None of that changes the benefits of offloading framebuffer work to the GPU.

The point is that the failure of prominent "accelerated" terminals to even match rxvt is a demonstration that spending the effort to write code that is far less portable is a poor tradeoff, and tends to show that people writing these terminals rarely understand where the actual bottlenecks are.

> but by and large everyone already moved to the GPU

This is only true to the extent you're calling running on top of OS API's that have been accelerated as "moving to the GPU". The vast majority of terminal emulators and editors are still relying on generic APIs that may or may not be accelerated depending on where they're running. Only a small minority are targeting GPU specific APIs. In this context when people are talking about GPU accelerated editors or terminals, they're talking about the latter.

chrisseaton wrote at 2022-02-24 00:29:08:

Don't all serious desktop environment render glyphs using the GPU anyway? So don't you get that for free when you use the standard frameworks?

floatboth wrote at 2022-02-24 02:11:57:

Gtk4 and QML sure do (QML still (?) uses SDFs (signed distance fields), Gtk4 uses atlas textures like WebRender does). Older frameworks don't.

jcelerier wrote at 2022-02-24 08:29:30:

The default in QML used to be atlases, you can revert back to them with the following:

      renderType: Text.NativeRendering

or:

      QQuickWindow::setTextRenderType(QQuickWindow::NativeTextRendering)

myfavoritedog wrote at 2022-02-24 00:48:04:

Standard frameworks give you text frames with minimal hooks to customize them and "good enough" performance. They're great for dumping a log file to or for small documents, but when you want the fine control of a text editor that can handle big files, you need to roll your own.

kazinator wrote at 2022-02-24 01:01:23:

GPU acceleration of the text editing viewport, which will not show much more than 100 lines of text at any time, has almost nothing to do with the issues in editing large files.

ben-schaaf wrote at 2022-02-24 00:02:53:

Can't speak for this project, but you simply can't get good performance at high resolutions (4k+) using CPU rendering.

sidpatil wrote at 2022-02-24 02:09:51:

I suppose "good performance" is subjective. I run VS Code on a 4k (well, 3840-by-2160) monitor attached to a ThinkPad T440, and the performance is acceptable. The CPU does get busier, but not by much.

Though the biggest limitation is that the refresh rate is only 30 Hz, since the HDMI version on that laptop is older, so maybe that has something to do with it as well.

zamadatix wrote at 2022-02-24 02:39:48:

VS Code certainly uses GPU accelerated rendering, it's an Electron app after all. Though yes only needing to hit 30 FPS to not be the bottleneck in your case does make things a bit moot.

dahart wrote at 2022-02-24 04:36:12:

> The CPU does get busier, but not by much.

Try again, with --disable-gpu.

vidarh wrote at 2022-02-24 12:59:41:

You could get good performance with CPU rendering on machines several orders of magnitude slower than what we have today, driving screens that certainly did not have as many orders of magnitude fewer pixels.

pcwalton wrote at 2022-02-24 00:04:15:

It's the right choice. Otherwise you're wasting power for no reason. The only reason not to use the GPU is driver bugs, usually on Linux, that cause it to use more energy than the CPU.

fulafel wrote at 2022-02-24 06:21:29:

Learning, as it reads on the page. GPU programming is hard, and splitting work between CPU and GPU is too, at least if you try to make the program go faster.

beepbooptheory wrote at 2022-02-24 01:44:14:

I was ready to be critical and say that "bullshit" is relative, that the interface makes the user, etc., but then I saw the emacs-like navigation and now it's hard to be too critical. Looks fine!

Anyway, just curious: if I use vim in a GPU-rendered shell like kitty, is it pretty much the same as this in terms of GPU-related benefits? Or is this supposed to be better?

anthk wrote at 2022-02-24 12:50:10:

Cairo, XCB and X extensions have done hardware acceleration since the 2D days (and now 3D, with almost any backend API, via Cairo). This is snake oil.

imachine1980_ wrote at 2022-02-24 11:39:32:

It sounds like the micro text editor. Take a look, it has an incredible UX.

zaps wrote at 2022-02-24 00:20:41:

Language

johnisgood wrote at 2022-02-24 08:34:57:

Not Rust, sorry.