Oliver Simmons oliversimmo at gmail.com
Wed Oct 13 22:53:13 BST 2021
- - - - - - - - - - - - - - - - - - -
On Wed, 13 Oct 2021 at 21:38, Chris McGowan <cmcgowan9990 at gmail.com> wrote:
Unfortunately, no. For example, take this line: `# Foo bar I'm a level-1 title`
A 3-char substring of that would yield "# F", which isn't useful.
In what way isn't it useful? It tells you literally everything you need
to know.
my $first3 = substr $line, 0, 3;
… simply taking a three character substring of the line should be enough …
Seems there was a misunderstanding or something, apologies.
I'll respond to the rest of the email with the assumption there wasn't
a misunderstanding.
```
# slightly magical regex: /g in list context returns every match,
# and assigning that list to () and then to a scalar gives us a count
# of matches (a plain scalar assignment would only give true/false)
if ( my $level = () = $first3 =~ m/#/g )
{
    return "Level $level header";
[...]
```
Using RegEx this way is in effect doing the starts-with method I
mentioned in my other email.
In fact, the substring isn't even necessary in this code
"In what way isn't it useful?", here's the answer.
It's useful as a method for limiting counts of '#', but that could
also be done with a `min(count, 3)` or you might not even be doing
counts.
but that's largely true for languages which have decent regex support.
A `.StartsWith(foo)` method works too.
For doing the counting '#' part this way, you'd do something along the
lines of:
```C# code
// Needs `using System;` and `using System.Linq;`
if (line.StartsWith('#')) {
    // Count '#' within the first three characters only
    int count = line.Substring(0, Math.Min(line.Length, 3)).Count(c => c == '#');
    // OR count every '#' in the line and clamp the result to 3:
    // int count = Math.Min(line.Count(c => c == '#'), 3);
}
```
(Note: this code isn't tested, and there are faster ways to count
chars than Count, it's used here for clarity)
If you weren't using one of those (e.g. C) or are for some reason
allergic to regexes you could simply index the string to determine the
line type (note: this would likely improve speed, but probably only by
an imperceptibly small amount and likely wouldn't be worth it.)
Just to really drive home the point that this isn't a difficult task,
here's the version I wouldn't write unless I was using C (still in Perl
though):
```
# Note: split here is because perl doesn't allow direct subscripting of
# strings. In languages that do allow that, this other array is
# unnecessary and you could use $line directly.
my @first3 = split( "", substr( $line, 0, 3));
if ( $first3[0] eq '#' )
{
    if ( $first3[1] eq '#' )
    {
        if ( $first3[2] eq '#' )
        {
            return "Level 3 header";
        }
        return "Level 2 header";
    }
    return "Level 1 header";
}
```
Certainly an interesting method for header lines.
Since this is "Perl but with C tactics" code: does Perl have a switch
statement?
In summary, I hardly think it's impossible or even difficult to
unambiguously parse gemtext without having a mandatory space.
I never claimed that it wasn't, just that it's easier to program.
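To make the "no mandatory space" case concrete, here is a rough Python sketch that parses a heading whether or not a space follows the '#' marker. The name `parse_heading` and the `(level, text)` return shape are assumptions for illustration, not anything from the spec or from either email.

```python
def parse_heading(line: str):
    """Return (level, text) for a heading line, or None otherwise."""
    level = 0
    # Look at no more than the first three characters.
    while level < 3 and line[level:level + 1] == "#":
        level += 1
    if level == 0:
        return None
    # Whitespace after the marker is optional, so strip it if present.
    return level, line[level:].strip()
```

With this sketch `parse_heading("##Foo")` and `parse_heading("## Foo")` both give `(2, "Foo")`, while a fourth '#' ends up as part of the text.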
Here's a simple parser using the split-on-first-space method:
```C# code
// Max split of three is used so this can be re-used later when doing
// link-lines, no need to do more than that anyways
string[] splitLine = line.Split(" ", 3);
switch (splitLine[0]) {
    case "#":
    case "##":
    case "###":
        Console.WriteLine("Header level of {0} with text {1}",
            splitLine[0].Length, line[splitLine[0].Length..].Trim());
        // Rather than counting the header length, you could also just
        // use the other `case` statements.
        break;
    case "=>":
        Console.WriteLine("Link line with link {0}, title {1}",
            splitLine[1], splitLine[2]);
        break;
    [… and so on …]
    default:
        Console.WriteLine("Normal or preformatted line");
        break;
}
```
Thanks for your code examples,
-Oliver Simmons (GoodClover)