From the FAQ, before someone asks the obvious:
Why shouldn't I just use jq?
jq is awesome, and a lot more powerful than gron, but with that power comes complexity. gron aims to make it easier to use the tools you already know, like grep and sed.
gron's primary purpose is to make it easy to find the path to a value in a deeply nested JSON blob when you don't already know the structure; much of jq's power is unlocked only once you know that structure.
In simpler words, as a user: the jq query language [1] is obtuse, obscure, and incredibly hard to learn if you only need it for quick one-liners once in a blue moon. I've tried, believe me, but I should probably spend that much effort learning Chinese instead.
It's just operating at the wrong abstraction level, whereas gron is orders of magnitude easier to understand and _explore_.
1:
https://stedolan.github.io/jq/manual/
> In simpler words, as a user: the jq query language [1] is obtuse, obscure, and incredibly hard to learn if you only need it for quick one-liners once in a blue moon.
I don't agree that jq's query language is obtuse. It's a DSL for JSON document trees, and it's largely unfamiliar, but so is xpath or any other DOM transformation language.
The same thing is said about regex.
My take is that "it's obtuse" just translates to "I'm not familiar with it and I never bothered to get acquainted with it".
One thing we can agree on, though, is that jq's docs are awful at providing a decent tutorial for new users to ramp up.
I'm pretty good with regular expressions. I have spent a lot of time trying to get familiar with jq. The problem is that I never use it outside of parsing JSON files, yet I use regular expressions all over the place: on the command line, in Python and JavaScript and Java code. They are widely applicable. Their syntax is terse, but relatively small.
jq has never come naturally. Every time I try to intuit how to do something, my intuition fails. This is despite having read its man page a dozen times or more, and consulted it even more frequently than that.
I've spent 20+ years on the Unix command line. I know my way around most of it. I can use sed and awk and perl to great effect. But I just can't seem to get jq to stick.
Aside, but there's a lot of times when "I know jq can do this, but I forget exactly how, let me find it in the man page" and then... I find jq's man page as difficult as jq itself when trying to use it as a reference.
Anyway, $0.02.
Edited to add: as a basic query language, I find it easy to use. It's when I'm dealing with JSON that embeds literal JSON strings that need to be parsed as JSON a second time, or when I'm trying to manipulate one or more fields in some way before outputting, that I struggle. So it's when I'm trying to compose filters and functions inside jq that I find it hard to use.
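To make the double-parse case concrete (toy input; fromjson is the bit I always have to look up):
$ echo '{"payload": "{\"id\": 1}"}' | jq '.payload | fromjson | .id'
1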
I think it's more like "I'm not familiar with it and getting it to do something that seems like it should be easy is surprisingly hard, even though I'm putting in some effort." I've become pretty good at jq lately, but for several years before that I would occasionally have some problem that I knew jq could solve, and resolved to sit down and _learn the damn thing already_, and every time, found it surprisingly difficult. Until you get a really good understanding of it (and good techniques for debugging your expressions), it's often easier just to write a python script.
I love jq, and without detracting from it, gron looks like an extremely useful, "less difficult" complement to it.
Adding: in fact, gron's simplicity is downright inspired. It looks like all it does is convert your json blob into a bunch of assignment statements that have the full path from root to node, and the ability to parse that back into an object. Not sure why I didn't think of that intermediate form being way more greppable. Kudos to the author.
Just as an example, this just took me about a minute to get the data I wanted, whereas I probably spent a half an hour on it yesterday with jq:
curl -s https://static01.nyt.com/elections-assets/2020/data/api/2020-11-03/national-map-page/national/president.json | gron | grep -E 'races.*(leader_margin_votes|leader_margin_name_display|state_name)' | grep -vE 'townships|counties' | gron -ungron
If it had a name that didn't collide horribly with jQuery in search, I think it would be fine.
But I have tried to learn jq's syntax (it's pretty much a minilanguage) and it has been incredibly difficult.
I also remember that when I first tried learning regex it was also very difficult. That is, _until_ I learned about finite state machines and regular languages; after that CS fundamentals class I was able to make sense of regex in a way that stuck.
Is there a comparable theory for jq's mini-language?
Not a theory per se, but my "lightbulb moment" with jq came when I thought about it like this:
jq is basically a templating language, like Jsonnet or Jinja2. What jq calls a "filter" can also be called a template for the output format.
Like any template, a jq filter will have the same structure as the desired output, but may also include dynamically calculated (interpolated) data, which can be a selection from the input data.
So, at a high level, write your filter to look like your output, with hardcoded data. Then, replace the "dynamic" parts of the output data with selectors over the input.
Don't worry about any of the other features (e.g. conditionals, variables) until you need them to write your "selectors."
YMMV, but that's what's worked for me
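A toy example of that approach (made-up input; the filter mirrors the output shape, with selectors spliced into the "dynamic" spots):
$ echo '{"user":{"first":"Ada","last":"Lovelace"},"id":7}' | jq -c '{name: (.user.first + " " + .user.last), id}'
{"name":"Ada Lovelace","id":7}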
I don't know of any formal theory, but it feels a bit like functional programming because you don't often use variables (an advanced feature, as the manual says). I kind of got a feel for it by realizing that it wants to push a stream of objects through transformations, and that's about it. A few operators/functions can "split" the stream, or pull the stream back into place. Like, uh,
in.json:
{"a":1,"b":2}
jq -c '{a}' in.json
{"a":1}
The . is the current stream, so if I just do ". , .", it's kind of pushing two streams along:
jq -c '.,. | {a}' in.json
{"a":1}
{"a":1}
Then, of course, say:
jq -c '{a, b, c: .}' in.json
{"a":1,"b":2,"c":{"a":1,"b":2}}
It was going through the . stream, and I pulled the . stream right back in while doing so.
So it kind of helps to keep straight in my head when I've kind of got multiple _streams_ going, vs multiple values.
Someone (almost anyone) can probably explain better with formal theory, but I just kind of got a feel for it and kind of describe it like this.
I spend a decent amount of time at the command line wrangling data files. It's fun for me to get clever with other tools like awk and perl when stringing together one liners and I enjoy building my knowledge of these tools, but jq has just never stuck.
Is it possible that you learned awk and Perl when you were but a child, and now your aging system is becoming read-only?
- a fellow read-only system
Quite possibly, I did first play with Perl about 15 years before encountering jq. Some days I do feel as though my head is simply out of room, as my brain has been replaced by a large heap of curly braces, semicolons and stack traces.
read-only - getting older it sure feels like that.
I mean, I've been using grep and sed for 15 years now and I still struggle with anything beyond matching literals, since they use a "nonstandard" regexp syntax, and the GNU and BSD variants behave very differently, making for a lot of bugs in scripts that need to work on both Linux and MacOS. (Of course you can install GNU on macOS and BSD on Linux, but the whole advantage of bash scripts is that you can assume certain things are installed on the user's system; if you can't satisfy that assumption you may as well use Python or similar.) I think gron has value for those simpler grep cases, but for anything beyond that, jq is the way to go. (Incidentally, I'm very dissatisfied with all of the tools that aspire to be "jq for yaml", and even with the relative dearth of tools for converting YAML to JSON on the command line.)
>_GNU and BSD variants behave very differently making for a lot of bugs on scripts that need to work on both Linux and MacOS_
Perl shines for this use case (assuming it is present on the machines you are working with). It is slower than grep/sed/awk for most cases, but it is more powerful and more portable across platforms.
>_converting YAML to JSON on the command line_
check out
https://github.com/bronze1man/yaml2json
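Or, assuming a Python with PyYAML is already installed, a one-liner does it too:
$ python3 -c 'import sys, yaml, json; json.dump(yaml.safe_load(sys.stdin), sys.stdout)' < in.yaml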
> Perl shines for this use case (assuming it is present on the machines you are working with). It is slower than grep/sed/awk for most cases, but it is more powerful and more portable across platforms.
Agreed.
For better or worse, when performance is not a concern in my scripts, I just shell out to "perl -pe" rather than trying to deal with grep, sed or awk.
It just works.
While it's true that jq's DSL has a bit of a learning curve, being able to try expressions and see immediate feedback can help immensely.
Here is a demo of a small script I wrote that shows jq results as you type using FZF:
https://asciinema.org/a/349330
(link to script is in the description)
It also includes the ability to easily "bookmark" expressions and return to them, so you don't have to worry about losing an expression that's _almost_ working while you experiment with another one.
As a jq novice, I've personally found it to be super useful.
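Not the script itself, but the basic idea can be sketched with fzf's {q} placeholder (assuming in.json is the file being explored):
$ echo '' | fzf --print-query --preview 'jq -C {q} in.json'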
I gave up on jq and use jello[1][2] instead. gron too looks nice.
[1]
https://blog.kellybrazil.com/2020/03/25/jello-the-jq-alterna...
[2]
https://github.com/kellyjonbrazil/jello
Thanks - I'm glad to find out that I'm not the only one that struggled with jq.
It's just an array programming language. Not everything has to be C, and I think it's unfair to call a language obtuse and incredibly hard to learn just because you're not used to the style.
Even simpler: _gron_ is a tool made in the spirit of Unix while _jq_ is not.
I regularly use jq to summarize the structure of big JSON blobs, using the snippet written here (I alias it to "jq-structure"):
https://github.com/stedolan/jq/issues/243#issuecomment-48470...
For example, against the public AWS IP address JSON document, it produces an output like
$ curl -s 'https://ip-ranges.amazonaws.com/ip-ranges.json' | jq -r '[path(..)|map(if type=="number" then "[]" else tostring end)|join(".")|split(".[]")|join("[]")]|unique|map("."+.)|.[]'
.
.createDate
.ipv6_prefixes
.ipv6_prefixes[]
.ipv6_prefixes[].ipv6_prefix
.ipv6_prefixes[].network_border_group
.ipv6_prefixes[].region
.ipv6_prefixes[].service
.prefixes
.prefixes[]
.prefixes[].ip_prefix
.prefixes[].network_border_group
.prefixes[].region
.prefixes[].service
.syncToken
This plus some copy/paste has worked pretty well for me.
Hey! That's kind of how I use the CLI (API?) at AWS. It works pretty well! And fortunately (for me), not too much thinking involved.
BTW: I have a D3 front-end dashboard/console for the app (not admin) that makes this a little bit harder, but D3 is pretty organized (and well-documented), if you can figure out what you are trying to do with it.
Wow, it looks so easy, why doesn't everybody do that? /s
That jq query looks like an unwise Perl one-liner.
Yes, it is arguably an unholy contrivance, but someone's already written it, and invoking it as a shell alias or likewise _is_ both easy and useful.
$ jq-structure my-file.json
It feels like finding a deeply nested key in a structured document is a job for XPath. Most people, myself included until recently, overlook that XPath 3.1 operates on JSON.
FWIW, my two cents:
I like that jq's query expression syntax is command line (bash) friendly. My hunch is that xpath expressions would be awkward to work with.
I've done too much xpath, xquery, xslt, css selectors. For my own work (dog fooding), I settled on mostly using very simple globbing expressions. Then use the host language's 'foreach' equiv for iterating result sets.
Globbing's double asterisk wildcard is the feature I most miss in other query engines.
https://en.wikipedia.org/wiki/Glob_%28programming%29
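In bash, for instance, the double asterisk is hidden behind the globstar option:
$ shopt -s globstar
$ printf '%s\n' **/*.json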
Looping back to command line xpath: there's always some impedance mismatch between the query and host languages. IIRC, one of the shells - chubot's oilshell, or fish? - has more rational expression evaluation (compared to bash).
You especially see this with regexes. It's a major language design fail that others haven't adopted Perl's first-class regex intrinsics. C# has LINQ, sure. But that's more xquery than xpath. And I've never liked xquery.
In other words, "blue collar" programming languages should have intrinsic path expressions. Whatever the syntax.
YMMV.
Is there a command line tool that supports XPath on JSON?
Yes, xidel. The author hangs out here.
I use xmllint for html and xml. I don't think it supports JSON.
Most people ignore XPath.
Can jq do what gron does?
Technically no, because it offers no comparable way to interface with the line-based world of Unix tools.
Practically, most things you'd do with gron and grep, sed, awk, ... you could do using only jq as well. Jq comes with massive cognitive overhead though and has a bunch of very unpleasant gotchas (like silently corrupting numbers with abs > 2^53, although very recent jq graciously no longer does that iff you do no processing on the number).
I find jq pretty useful, but I have no love for it.
Actually, I think it might be possible to implement gron in jq (you can produce "plaintext", not just json, and the processing facilities jq offers _might_ be powerful enough to escape everything appropriately), but it's not something I'm curious enough about to try.
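That said, a rough sketch of the flattening half is short, if you ignore gron's bracket syntax and identifier escaping (toy input):
$ echo '{"a":[{"b":1}]}' | jq -r 'paths(scalars) as $p | ($p | map(tostring) | join(".")) as $path | "json." + $path + " = " + (getpath($p) | tojson) + ";"'
json.a.0.b = 1;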
> Can jq do what gron does?
It really depends on what you want to do, and thus what you think gron does.
If all you want to do is search for properties with a given value then yes, jq does that very well.
Unlike gron, jq even allows users to output search results as valid json docs. Hell, jq allows users to transform entire JSON docs.
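For example, searching for a property with a given value (toy input):
$ echo '{"a":{"name":"x"},"b":{"name":"y"}}' | jq -c '.. | objects | select(.name? == "x")'
{"name":"x"}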
However, if all you want to do is expand the JSON path at each symbol, then I don't know if jq supports that use case. But then again, why would anyone want to do that?
The linked readme demonstrates gron outputting search results as json.
I think you should just invest the hour or so it takes to learn jq. Yes, it's far from a programming language design marvel. But it covers all of the edge cases, and once you learn it, you can be very productive. (But the strategy of "copy-paste a one-liner from Stack Overflow the one time a year I need it" isn't going to work.)
I think structured data is so common now, that you have to invest in learning tools for processing it. Personally, I invested the time once, and it saves me every single day. In the past, I would have a question like "which port is this Pod listening on", and write something like "kubectl get pod foo -o yaml | grep port -A 3". Usually you get your answer after manually reading through the false-positives. But with "jq", you can just drive directly to the correct answer: "kubectl get pod foo -o json | jq '.spec.containers[].ports'"
Maybe it's kind of obtuse, but it's worth your time, I promise.
How about a tool which outputs nicely formatted JSON, with every line annotated with a jq expression to access that value?
But then I squint a little bit at the default gron (not ungron) output, and that's actually what I see.
But how do you get that '.spec.containers[].ports'?
It seems to me that for your example use case, gron is at least useful to first understand the JSON structure before writing your jq query. And, for simple use cases like this one, enough to replace jq altogether.
Well, the schema of the JSON is something you have to come up with on your own. I happen to have seen like 8 trillion pod manifests so I know what I'm looking for, but if you don't, you have to figure out the schema in some other way. To reverse engineer something, I usually pipe into keys (jq keys, jq '.spec | keys', jq '.spec.containers[] | keys', etc.)
For Kubernetes specifically, "kubectl explain pod", "kubectl explain pod.spec", etc. will help you find what you're looking for.
> Well, the schema of the JSON is something you have to come up with on your own. I happen to have seen like 8 trillion pod manifests so I know what I'm looking for, but if you don't, you have to figure out the schema in some other way.
Well, or you just do
kubectl get pod foo -o json | gron | grep port
and you will get the answer to the original question + the path.
Better to learn xidel
http://www.videlibri.de/xidel.html
which is standards-based and saner to read.
I love that this handles big numbers without modifying them, unlike `jq`:
https://github.com/stedolan/jq/issues/2182
$ echo "{\"a\": 13911860366432393}" | jq "."
{
  "a": 13911860366432392
}
$ echo "{\"a\": 13911860366432393}" | gron | gron -u
{
  "a": 13911860366432393
}
I can now happily uninstall `jq`. I've been burned by it way too many times.
ouch, I did not know that! thanks for the warning, need to check if my installed version has the fix already.
I don't find this grep-able at all:
json[0].commit.author.name = "Tom Hudson";
Now I need to escape brackets and dots in regex. Genius!
I have a 5-line (!) jq script that produces this:
json_0_commit_author_name='Tom Hudson'
This is what I call grep-able. It's also eval-able.
What if there's a JSON object with both commit and json_commit?
Then I'll use jq to filter it appropriately or change delimiter. The point is ease of use for grep and shell.
I think this is a good point. It's definitely hard to grep for certain parts of gron's output, especially where arrays are involved, because of the square brackets. I find that using fgrep/grep -F can help with that in situations where you don't need regular expressions though.
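For example, with -F the brackets and dots match literally:
$ echo '[{"commit":{"author":{"name":"Tom Hudson"}}}]' | gron | grep -F 'json[0].commit.author'
json[0].commit.author = {};
json[0].commit.author.name = "Tom Hudson";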
It's not an ideal output format for sure, but it does meet some criteria that I considered to be desirable.
Firstly: it's unambiguous. While your suggested format is easier to grep, it is also lossy as you mention. One of my goals with gron was to make the process reversible (i.e. with gron -u), which would not be possible with such a lossy format.
Secondly: it's valid JavaScript. Perhaps that's a minor thing, but it means that the statements are eval-able in either Node.js or in a browser. It's a fairly small thing, but it is something I've used on a few occasions. Using JavaScript syntax also means I didn't need to invent new rules for how things should be done, I could just follow a subset of existing rules.
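A toy demonstration of that second point, eval-ing gron's output in Node and reading the resulting object back:
$ echo '{"name":"Tom"}' | gron | node -e 'eval(require("fs").readFileSync(0, "utf8")); console.log(json.name)'
Tom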
FWIW, personally I'm usually using gron to help gain a better understanding of the structure of an object; often trying to find where a piece of known data exists, which means grepping for the value rather than the key/path - avoiding many of the problems you mention.
Thanks for your input :) I'd like to see your jq script to help me learn some more about jq!
With `fgrep` and quoting in single quotes, you don't have to escape anything.
What is your jq script?
#!/usr/bin/jq -rf
tostream
| select(length == 2)
| (
    ( [ .[0][] | tostring | gsub("[^\\w]"; "_") ] | join("_") )
    + "="
    + ( .[1] | tostring | @sh )
  )
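Saved as, say, flatten.jq (hypothetical name) and marked executable, it works like this on Linux, where the kernel passes "-rf" through the shebang (note the script as posted doesn't add the json_ prefix shown earlier):
$ echo '{"commit": {"author": {"name": "Tom Hudson"}}}' | ./flatten.jq
commit_author_name='Tom Hudson'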
Can you share the jq script ?
Great with
While I like what jq lets me do, I actually find it really difficult to use. It's very rare that I attempt to use it without having to consult the docs, and when I try to do anything remotely complex it often takes ages to figure it out.
I very much like the look of gron for the simpler stuff!
"Or you could create a shell script in your $PATH named ungron or norg to affect all users"
You could also check argv[0] to see if you were called via the `ungron` name. Then it would be as simple as a symlink, which is very easy to add at install/packaging time.
(I know it's fairly broadly known, but this is the "multicall binary" pattern:
https://flameeyes.blog/2009/10/19/multicall-binaries/
)
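A minimal sketch of the shell-script variant mentioned above (hypothetical dispatcher; install it under the names ungron and norg, not gron, or the fallback branch would call itself):

#!/bin/sh
# Dispatch on the name we were invoked as (argv[0]).
case "$(basename "$0")" in
  ungron|norg) exec gron --ungron "$@" ;;
  *)           exec gron "$@" ;;
esac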
Please submit this as an issue to the repo.
Looks like someone has done so here:
https://github.com/tomnomnom/gron/issues/77
Yup, and it’s been merged :-)
If you'd like to use something like this in your own APIs to let your clients filter requests or on the CLI (as is the intention with gron), consider giving "json-mask" a try (you'll need Node.js installed):
$ echo '{"user": {"name": "Sam", "age": 40}}' | npx json-mask "user/age"
{"user":{"age":40}}
or (from the first gron example; the results are identical)
$ gron "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | fgrep "commit.author" | gron --ungron
$ curl "https://api.github.com/repos/tomnomnom/gron/commits?per_page=1" | npx json-mask "commit/author"
If you've ever used Google APIs' `fields=` query param you already know how to use json-mask; it's super simple:
a,b,c  - comma-separated list will select multiple fields
a/b/c  - path will select a field from its parent
a(b,c) - sub-selection will select many fields from a parent
a/*/c  - the star * wildcard will select all items in a field
So it basically flattens a JSON file into lines of flattened-key = value, which makes it easy to just grep.
Same as html2/xml2.
The author of the tool did a really nice tutorial on doing bug bounty recon using Linux, in which he also used gron:
https://youtu.be/l8iXMgk2nnY?t=1335
This was on the front page before, which generated some good discussion:
https://news.ycombinator.com/item?id=16727665
Yeah, two years ago, and no commits (roughly) since.
(Has bug reports, has PRs, it's not 'done'.)
I'm sorry to say this is down to my (the author's) mental health issues over the last few years.
I hope to be able to face dealing with people's issues and PRs soon.
I'm sorry to hear that - I didn't mean it as a criticism (you're free to work on or not work on whatever you want, of course!); I was just surprised at the level of traction the post was getting, I suppose.
All the best.
No problem at all! I'm as surprised as you are to be honest! Still makes me happy to see people getting use out of any of my tools though :)
Wow, I am typically hesitant to adopt new tools into my flow. Oftentimes they either don't solve my problem all that much better, or they try to do too much.
This looks _perfect_. Does one thing and does it well. I will be adopting this :-)
Is this better than the old solution of json_pp < jsonfile | grep 'pattern' ?
While that's only useful for picking out specific named keys without context, that's often good enough to get the job done. Added bonus is that json_pp and grep are usually installed by default so you don't have to install anything.
This is a fantastic idea.
I have installed gron on all my development machines.
Will probably use it heavily when working with awscli. I'm not conversant enough in the jq query language to not have to look things up when writing even somewhat complex scripts. And I don't want to learn awscli's custom query syntax. :)
Thought at first that it might be possible to replicate gron's functionality by some magic composition of jq, xargs, and grep, but that was before I understood the full awesomeness of gron - piping through grep or sed maintains gron context, so you can still ungron later.
Nice work, thank you!
1. Is there a name/"standard" for the format gron is transforming json into?
2. Thesis:
jq is cumbersome when used on a json input of serious size/complexity because upfront knowledge of the structure of the json is needed to formulate correct search queries.
Gron supports that "uninformed search" use-case much better.
Prove me wrong ;)
1. There isn't really a name for it, but it's a subset of JavaScript. The grammar is available here, specified in EBNF, with some railroad diagrams to aid understanding:
https://tomnomnom.github.io/gron/
2. That's pretty much exactly why I wrote the tool :)
gron outputs JavaScript!
I use this all the time when working with new APIs.
JSON Path Names at
https://www.convertjson.com/json-path-list.htm
will do this too, plus break down all the pieces in a nice searchable table format. (Disclosure: I'm the author.)
There is also JMESPath, which has a proper spec.
https://gatling.io/2019/07/31/introducing-jmespath-support/
JMESPath has a spec, that is true, but JMESPath also has some serious limitations [1]. If I'm doing any JSON manipulation on the command line then I'll reach for jq.
That said, gron certainly looks like it offers simplicity in cases where using jq would be too complicated.
[1]
https://news.ycombinator.com/item?id=16400320
Catj is worth a mention. Similar, but written in Node.js.
https://github.com/soheilpro/catj
Very nice! Though I don't like that it can also make network requests. It's a potential security hole, and completely unnecessary given that we already have curl and pipes for that.
My humble attempt at building a tool similar to this and jq but with saner DSL syntax
https://github.com/jsqry/jsqry-cli2
Assuming GRON is short for "grep json", it seems like the tool could have a better name. It looks like it is useful beyond just the grepping case.
See also:
https://github.com/micha/json-table
I love it. I have never used jq without first googling how to do the thing I want to do. It does not do one thing well; it does many things not so well.
Gron is awesome, I use it frequently to quickly find jsonpath queries to kubernetes manifest properties.
Fantastic tool. Thanks for sharing.
Also, fx is very handy for easily inspecting JSON, and is one I like using too.
https://github.com/antonmedv/fx
Oh, it's interactive. Perfect!
This is going to blow your mind but you can actually GREP the output of jq. I feel so bad for how long it took this man to write a language not knowing that.