1 --- 2 title: CommonMark Spec 3 author: John MacFarlane 4 version: 0.28 5 date: '2017-08-01' 6 license: '[CC-BY-SA 4.0](http://creativecommons.org/licenses/by-sa/4.0/)' 7 ... 8 9 # Introduction 10 11 ## What is Markdown? 12 13 Markdown is a plain text format for writing structured documents, 14 based on conventions for indicating formatting in email 15 and usenet posts. It was developed by John Gruber (with 16 help from Aaron Swartz) and released in 2004 in the form of a 17 [syntax description](http://daringfireball.net/projects/markdown/syntax) 18 and a Perl script (`Markdown.pl`) for converting Markdown to 19 HTML. In the next decade, dozens of implementations were 20 developed in many languages. Some extended the original 21 Markdown syntax with conventions for footnotes, tables, and 22 other document elements. Some allowed Markdown documents to be 23 rendered in formats other than HTML. Websites like Reddit, 24 StackOverflow, and GitHub had millions of people using Markdown. 25 And Markdown started to be used beyond the web, to author books, 26 articles, slide shows, letters, and lecture notes. 27 28 What distinguishes Markdown from many other lightweight markup 29 syntaxes, which are often easier to write, is its readability. 30 As Gruber writes: 31 32 > The overriding design goal for Markdown's formatting syntax is 33 > to make it as readable as possible. The idea is that a 34 > Markdown-formatted document should be publishable as-is, as 35 > plain text, without looking like it's been marked up with tags 36 > or formatting instructions. 37 > (<http://daringfireball.net/projects/markdown/>) 38 39 The point can be illustrated by comparing a sample of 40 [AsciiDoc](http://www.methods.co.nz/asciidoc/) with 41 an equivalent sample of Markdown. Here is a sample of 42 AsciiDoc from the AsciiDoc manual: 43 44 ``` 45 1. List item one. 46 + 47 List item one continued with a second paragraph followed by an 48 Indented block. 49 + 50 ................. 
51 $ ls *.sh 52 $ mv *.sh ~/tmp 53 ................. 54 + 55 List item continued with a third paragraph. 56 57 2. List item two continued with an open block. 58 + 59 -- 60 This paragraph is part of the preceding list item. 61 62 a. This list is nested and does not require explicit item 63 continuation. 64 + 65 This paragraph is part of the preceding list item. 66 67 b. List item b. 68 69 This paragraph belongs to item two of the outer list. 70 -- 71 ``` 72 73 And here is the equivalent in Markdown: 74 ``` 75 1. List item one. 76 77 List item one continued with a second paragraph followed by an 78 Indented block. 79 80 $ ls *.sh 81 $ mv *.sh ~/tmp 82 83 List item continued with a third paragraph. 84 85 2. List item two continued with an open block. 86 87 This paragraph is part of the preceding list item. 88 89 1. This list is nested and does not require explicit item continuation. 90 91 This paragraph is part of the preceding list item. 92 93 2. List item b. 94 95 This paragraph belongs to item two of the outer list. 96 ``` 97 98 The AsciiDoc version is, arguably, easier to write. You don't need 99 to worry about indentation. But the Markdown version is much easier 100 to read. The nesting of list items is apparent to the eye in the 101 source, not just in the processed document. 102 103 ## Why is a spec needed? 104 105 John Gruber's [canonical description of Markdown's 106 syntax](http://daringfireball.net/projects/markdown/syntax) 107 does not specify the syntax unambiguously. Here are some examples of 108 questions it does not answer: 109 110 1. How much indentation is needed for a sublist? The spec says that 111 continuation paragraphs need to be indented four spaces, but is 112 not fully explicit about sublists. It is natural to think that 113 they, too, must be indented four spaces, but `Markdown.pl` does 114 not require that. 
This is hardly a "corner case," and divergences 115 between implementations on this issue often lead to surprises for 116 users in real documents. (See [this comment by John 117 Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).) 118 119 2. Is a blank line needed before a block quote or heading? 120 Most implementations do not require the blank line. However, 121 this can lead to unexpected results in hard-wrapped text, and 122 also to ambiguities in parsing (note that some implementations 123 put the heading inside the blockquote, while others do not). 124 (John Gruber has also spoken [in favor of requiring the blank 125 lines](http://article.gmane.org/gmane.text.markdown.general/2146).) 126 127 3. Is a blank line needed before an indented code block? 128 (`Markdown.pl` requires it, but this is not mentioned in the 129 documentation, and some implementations do not require it.) 130 131 ``` markdown 132 paragraph 133 code? 134 ``` 135 136 4. What is the exact rule for determining when list items get 137 wrapped in `<p>` tags? Can a list be partially "loose" and partially 138 "tight"? What should we do with a list like this? 139 140 ``` markdown 141 1. one 142 143 2. two 144 3. three 145 ``` 146 147 Or this? 148 149 ``` markdown 150 1. one 151 - a 152 153 - b 154 2. two 155 ``` 156 157 (There are some relevant comments by John Gruber 158 [here](http://article.gmane.org/gmane.text.markdown.general/2554).) 159 160 5. Can list markers be indented? Can ordered list markers be right-aligned? 161 162 ``` markdown 163 8. item 1 164 9. item 2 165 10. item 2a 166 ``` 167 168 6. Is this one list with a thematic break in its second item, 169 or two lists separated by a thematic break? 170 171 ``` markdown 172 * a 173 * * * * * 174 * b 175 ``` 176 177 7. When list markers change from numbers to bullets, do we have 178 two lists or one? (The Markdown syntax description suggests two, 179 but the perl scripts and many other implementations produce one.) 
180 181 ``` markdown 182 1. fee 183 2. fie 184 - foe 185 - fum 186 ``` 187 188 8. What are the precedence rules for the markers of inline structure? 189 For example, is the following a valid link, or does the code span 190 take precedence? 191 192 ``` markdown 193 [a backtick (`)](/url) and [another backtick (`)](/url). 194 ``` 195 196 9. What are the precedence rules for markers of emphasis and strong 197 emphasis? For example, how should the following be parsed? 198 199 ``` markdown 200 *foo *bar* baz* 201 ``` 202 203 10. What are the precedence rules between block-level and inline-level 204 structure? For example, how should the following be parsed? 205 206 ``` markdown 207 - `a long code span can contain a hyphen like this 208 - and it can screw things up` 209 ``` 210 211 11. Can list items include section headings? (`Markdown.pl` does not 212 allow this, but does allow blockquotes to include headings.) 213 214 ``` markdown 215 - # Heading 216 ``` 217 218 12. Can list items be empty? 219 220 ``` markdown 221 * a 222 * 223 * b 224 ``` 225 226 13. Can link references be defined inside block quotes or list items? 227 228 ``` markdown 229 > Blockquote [foo]. 230 > 231 > [foo]: /url 232 ``` 233 234 14. If there are multiple definitions for the same reference, which takes 235 precedence? 236 237 ``` markdown 238 [foo]: /url1 239 [foo]: /url2 240 241 [foo][] 242 ``` 243 244 In the absence of a spec, early implementers consulted `Markdown.pl` 245 to resolve these ambiguities. But `Markdown.pl` was quite buggy, and 246 gave manifestly bad results in many cases, so it was not a 247 satisfactory replacement for a spec. 248 249 Because there is no unambiguous spec, implementations have diverged 250 considerably. As a result, users are often surprised to find that 251 a document that renders one way on one system (say, a GitHub wiki) 252 renders differently on another (say, converting to DocBook using 253 pandoc). 
To make matters worse, because nothing in Markdown counts 254 as a "syntax error," the divergence often isn't discovered right away. 255 256 ## About this document 257 258 This document attempts to specify Markdown syntax unambiguously. 259 It contains many examples with side-by-side Markdown and 260 HTML. These are intended to double as conformance tests. An 261 accompanying script `spec_tests.py` can be used to run the tests 262 against any Markdown program: 263 264 python test/spec_tests.py --spec spec.txt --program PROGRAM 265 266 Since this document describes how Markdown is to be parsed into 267 an abstract syntax tree, it would have made sense to use an abstract 268 representation of the syntax tree instead of HTML. But HTML is capable 269 of representing the structural distinctions we need to make, and the 270 choice of HTML for the tests makes it possible to run the tests against 271 an implementation without writing an abstract syntax tree renderer. 272 273 This document is generated from a text file, `spec.txt`, written 274 in Markdown with a small extension for the side-by-side tests. 275 The script `tools/makespec.py` can be used to convert `spec.txt` into 276 HTML or CommonMark (which can then be converted into other formats). 277 278 In the examples, the `→` character is used to represent tabs. 279 280 # Preliminaries 281 282 ## Characters and lines 283 284 Any sequence of [characters] is a valid CommonMark 285 document. 286 287 A [character](@) is a Unicode code point. Although some 288 code points (for example, combining accents) do not correspond to 289 characters in an intuitive sense, all code points count as characters 290 for purposes of this spec. 291 292 This spec does not specify an encoding; it thinks of lines as composed 293 of [characters] rather than bytes. A conforming parser may be limited 294 to a certain encoding. 
295 296 A [line](@) is a sequence of zero or more [characters] 297 other than newline (`U+000A`) or carriage return (`U+000D`), 298 followed by a [line ending] or by the end of file. 299 300 A [line ending](@) is a newline (`U+000A`), a carriage return 301 (`U+000D`) not followed by a newline, or a carriage return and a 302 following newline. 303 304 A line containing no characters, or a line containing only spaces 305 (`U+0020`) or tabs (`U+0009`), is called a [blank line](@). 306 307 The following definitions of character classes will be used in this spec: 308 309 A [whitespace character](@) is a space 310 (`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`), 311 form feed (`U+000C`), or carriage return (`U+000D`). 312 313 [Whitespace](@) is a sequence of one or more [whitespace 314 characters]. 315 316 A [Unicode whitespace character](@) is 317 any code point in the Unicode `Zs` general category, or a tab (`U+0009`), 318 carriage return (`U+000D`), newline (`U+000A`), or form feed 319 (`U+000C`). 320 321 [Unicode whitespace](@) is a sequence of one 322 or more [Unicode whitespace characters]. 323 324 A [space](@) is `U+0020`. 325 326 A [non-whitespace character](@) is any character 327 that is not a [whitespace character]. 328 329 An [ASCII punctuation character](@) 330 is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`, 331 `*`, `+`, `,`, `-`, `.`, `/`, `:`, `;`, `<`, `=`, `>`, `?`, `@`, 332 `[`, `\`, `]`, `^`, `_`, `` ` ``, `{`, `|`, `}`, or `~`. 333 334 A [punctuation character](@) is an [ASCII 335 punctuation character] or anything in 336 the general Unicode categories `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`. 337 338 ## Tabs 339 340 Tabs in lines are not expanded to [spaces]. However, 341 in contexts where whitespace helps to define block structure, 342 tabs behave as if they were replaced by spaces with a tab stop 343 of 4 characters. 
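
The tab-stop rule above can be sketched in Python. This is an illustration only, not part of the spec: the names `TAB_STOP` and `indent_width` are hypothetical, and the function does nothing more than measure the leading indentation of a line in columns, with each tab advancing to the next multiple of 4.

```python
TAB_STOP = 4  # tab stop used when whitespace defines block structure


def indent_width(line: str, start_column: int = 0) -> int:
    """Return the width, in columns, of the leading spaces and tabs of `line`.

    `start_column` is the column at which `line` begins: 0 for a whole
    line, nonzero when measuring the text that follows a block marker.
    """
    column = start_column
    for ch in line:
        if ch == " ":
            column += 1
        elif ch == "\t":
            # A tab advances to the next multiple of TAB_STOP.
            column += TAB_STOP - (column % TAB_STOP)
        else:
            break
    return column - start_column
```

Under these assumptions, `indent_width("\tfoo")` and `indent_width("  \tfoo")` both give 4, so either line is indented enough for an indented code block. A tab that starts at column 1 (for example, directly after a `>` in the first column) is only three columns wide, so `indent_width("\tfoo", start_column=1)` gives 3.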
344 345 Thus, for example, a tab can be used instead of four spaces 346 in an indented code block. (Note, however, that internal 347 tabs are passed through as literal tabs, not expanded to 348 spaces.) 349 350 ```````````````````````````````` example 351 →foo→baz→→bim 352 . 353 <pre><code>foo→baz→→bim 354 </code></pre> 355 ```````````````````````````````` 356 357 ```````````````````````````````` example 358 →foo→baz→→bim 359 . 360 <pre><code>foo→baz→→bim 361 </code></pre> 362 ```````````````````````````````` 363 364 ```````````````````````````````` example 365 a→a 366 ὐ→a 367 . 368 <pre><code>a→a 369 ὐ→a 370 </code></pre> 371 ```````````````````````````````` 372 373 In the following example, a continuation paragraph of a list 374 item is indented with a tab; this has exactly the same effect 375 as indentation with four spaces would: 376 377 ```````````````````````````````` example 378 - foo 379 380 →bar 381 . 382 <ul> 383 <li> 384 <p>foo</p> 385 <p>bar</p> 386 </li> 387 </ul> 388 ```````````````````````````````` 389 390 ```````````````````````````````` example 391 - foo 392 393 →→bar 394 . 395 <ul> 396 <li> 397 <p>foo</p> 398 <pre><code> bar 399 </code></pre> 400 </li> 401 </ul> 402 ```````````````````````````````` 403 404 Normally the `>` that begins a block quote may be followed 405 optionally by a space, which is not considered part of the 406 content. In the following case `>` is followed by a tab, 407 which is treated as if it were expanded into three spaces. 408 Since one of these spaces is considered part of the 409 delimiter, `foo` is considered to be indented six spaces 410 inside the block quote context, so we get an indented 411 code block starting with two spaces. 412 413 ```````````````````````````````` example 414 >→→foo 415 . 416 <blockquote> 417 <pre><code> foo 418 </code></pre> 419 </blockquote> 420 ```````````````````````````````` 421 422 ```````````````````````````````` example 423 -→→foo 424 . 
425 <ul> 426 <li> 427 <pre><code> foo 428 </code></pre> 429 </li> 430 </ul> 431 ```````````````````````````````` 432 433 434 ```````````````````````````````` example 435 foo 436 →bar 437 . 438 <pre><code>foo 439 bar 440 </code></pre> 441 ```````````````````````````````` 442 443 ```````````````````````````````` example 444 - foo 445 - bar 446 → - baz 447 . 448 <ul> 449 <li>foo 450 <ul> 451 <li>bar 452 <ul> 453 <li>baz</li> 454 </ul> 455 </li> 456 </ul> 457 </li> 458 </ul> 459 ```````````````````````````````` 460 461 ```````````````````````````````` example 462 #→Foo 463 . 464 <h1>Foo</h1> 465 ```````````````````````````````` 466 467 ```````````````````````````````` example 468 *→*→*→ 469 . 470 <hr /> 471 ```````````````````````````````` 472 473 474 ## Insecure characters 475 476 For security reasons, the Unicode character `U+0000` must be replaced 477 with the REPLACEMENT CHARACTER (`U+FFFD`). 478 479 # Blocks and inlines 480 481 We can think of a document as a sequence of 482 [blocks](@)---structural elements like paragraphs, block 483 quotations, lists, headings, rules, and code blocks. Some blocks (like 484 block quotes and list items) contain other blocks; others (like 485 headings and paragraphs) contain [inline](@) content---text, 486 links, emphasized text, images, code spans, and so on. 487 488 ## Precedence 489 490 Indicators of block structure always take precedence over indicators 491 of inline structure. So, for example, the following is a list with 492 two items, not a list with one item containing a code span: 493 494 ```````````````````````````````` example 495 - `one 496 - two` 497 . 498 <ul> 499 <li>`one</li> 500 <li>two`</li> 501 </ul> 502 ```````````````````````````````` 503 504 505 This means that parsing can proceed in two steps: first, the block 506 structure of the document can be discerned; second, text lines inside 507 paragraphs, headings, and other block constructs can be parsed for inline 508 structure. 
The second step requires information about link reference 509 definitions that will be available only at the end of the first 510 step. Note that the first step requires processing lines in sequence, 511 but the second can be parallelized, since the inline parsing of 512 one block element does not affect the inline parsing of any other. 513 514 ## Container blocks and leaf blocks 515 516 We can divide blocks into two types: 517 [container block](@)s, 518 which can contain other blocks, and [leaf block](@)s, 519 which cannot. 520 521 # Leaf blocks 522 523 This section describes the different kinds of leaf block that make up a 524 Markdown document. 525 526 ## Thematic breaks 527 528 A line consisting of 0-3 spaces of indentation, followed by a sequence 529 of three or more matching `-`, `_`, or `*` characters, each followed 530 optionally by any number of spaces, forms a 531 [thematic break](@). 532 533 ```````````````````````````````` example 534 *** 535 --- 536 ___ 537 . 538 <hr /> 539 <hr /> 540 <hr /> 541 ```````````````````````````````` 542 543 544 Wrong characters: 545 546 ```````````````````````````````` example 547 +++ 548 . 549 <p>+++</p> 550 ```````````````````````````````` 551 552 553 ```````````````````````````````` example 554 === 555 . 556 <p>===</p> 557 ```````````````````````````````` 558 559 560 Not enough characters: 561 562 ```````````````````````````````` example 563 -- 564 ** 565 __ 566 . 567 <p>-- 568 ** 569 __</p> 570 ```````````````````````````````` 571 572 573 One to three spaces indent are allowed: 574 575 ```````````````````````````````` example 576 *** 577 *** 578 *** 579 . 580 <hr /> 581 <hr /> 582 <hr /> 583 ```````````````````````````````` 584 585 586 Four spaces is too many: 587 588 ```````````````````````````````` example 589 *** 590 . 591 <pre><code>*** 592 </code></pre> 593 ```````````````````````````````` 594 595 596 ```````````````````````````````` example 597 Foo 598 *** 599 . 
600 <p>Foo 601 ***</p> 602 ```````````````````````````````` 603 604 605 More than three characters may be used: 606 607 ```````````````````````````````` example 608 _____________________________________ 609 . 610 <hr /> 611 ```````````````````````````````` 612 613 614 Spaces are allowed between the characters: 615 616 ```````````````````````````````` example 617 - - - 618 . 619 <hr /> 620 ```````````````````````````````` 621 622 623 ```````````````````````````````` example 624 ** * ** * ** * ** 625 . 626 <hr /> 627 ```````````````````````````````` 628 629 630 ```````````````````````````````` example 631 - - - - 632 . 633 <hr /> 634 ```````````````````````````````` 635 636 637 Spaces are allowed at the end: 638 639 ```````````````````````````````` example 640 - - - - 641 . 642 <hr /> 643 ```````````````````````````````` 644 645 646 However, no other characters may occur in the line: 647 648 ```````````````````````````````` example 649 _ _ _ _ a 650 651 a------ 652 653 ---a--- 654 . 655 <p>_ _ _ _ a</p> 656 <p>a------</p> 657 <p>---a---</p> 658 ```````````````````````````````` 659 660 661 It is required that all of the [non-whitespace characters] be the same. 662 So, this is not a thematic break: 663 664 ```````````````````````````````` example 665 *-* 666 . 667 <p><em>-</em></p> 668 ```````````````````````````````` 669 670 671 Thematic breaks do not need blank lines before or after: 672 673 ```````````````````````````````` example 674 - foo 675 *** 676 - bar 677 . 678 <ul> 679 <li>foo</li> 680 </ul> 681 <hr /> 682 <ul> 683 <li>bar</li> 684 </ul> 685 ```````````````````````````````` 686 687 688 Thematic breaks can interrupt a paragraph: 689 690 ```````````````````````````````` example 691 Foo 692 *** 693 bar 694 . 
695 <p>Foo</p> 696 <hr /> 697 <p>bar</p> 698 ```````````````````````````````` 699 700 701 If a line of dashes that meets the above conditions for being a 702 thematic break could also be interpreted as the underline of a [setext 703 heading], the interpretation as a 704 [setext heading] takes precedence. Thus, for example, 705 this is a setext heading, not a paragraph followed by a thematic break: 706 707 ```````````````````````````````` example 708 Foo 709 --- 710 bar 711 . 712 <h2>Foo</h2> 713 <p>bar</p> 714 ```````````````````````````````` 715 716 717 When both a thematic break and a list item are possible 718 interpretations of a line, the thematic break takes precedence: 719 720 ```````````````````````````````` example 721 * Foo 722 * * * 723 * Bar 724 . 725 <ul> 726 <li>Foo</li> 727 </ul> 728 <hr /> 729 <ul> 730 <li>Bar</li> 731 </ul> 732 ```````````````````````````````` 733 734 735 If you want a thematic break in a list item, use a different bullet: 736 737 ```````````````````````````````` example 738 - Foo 739 - * * * 740 . 741 <ul> 742 <li>Foo</li> 743 <li> 744 <hr /> 745 </li> 746 </ul> 747 ```````````````````````````````` 748 749 750 ## ATX headings 751 752 An [ATX heading](@) 753 consists of a string of characters, parsed as inline content, between an 754 opening sequence of 1--6 unescaped `#` characters and an optional 755 closing sequence of any number of unescaped `#` characters. 756 The opening sequence of `#` characters must be followed by a 757 [space] or by the end of line. The optional closing sequence of `#`s must be 758 preceded by a [space] and may be followed by spaces only. The opening 759 `#` character may be indented 0-3 spaces. The raw contents of the 760 heading are stripped of leading and trailing spaces before being parsed 761 as inline content. The heading level is equal to the number of `#` 762 characters in the opening sequence. 
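
The ATX heading rule above can be approximated with two regular expressions. The following Python sketch is an assumption-laden illustration, not a reference implementation: the function name `parse_atx_heading` is hypothetical, the input is assumed to be a single line without its line ending, and the function looks at the line in isolation (it ignores container blocks and does not parse the inline content).

```python
import re
from typing import Optional, Tuple

# 0-3 spaces, then 1-6 '#' characters that are followed by a space or end of line.
_ATX_OPEN = re.compile(r" {0,3}(#{1,6})(?= |$)")
# Optional closing sequence: '#' characters preceded by a space (or making up
# the whole content) and followed by spaces only.
_ATX_CLOSE = re.compile(r"(?:^| )#+ *$")


def parse_atx_heading(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, raw_content) if `line` is an ATX heading, else None."""
    m = _ATX_OPEN.match(line)
    if m is None:
        return None
    level = len(m.group(1))
    raw = _ATX_CLOSE.sub("", line[m.end():])
    # Leading and trailing spaces are stripped before inline parsing.
    return level, raw.strip(" ")
```

For instance, `parse_atx_heading("## foo ##")` yields `(2, "foo")`, while `parse_atx_heading("#5 bolt")` and `parse_atx_heading("####### foo")` yield `None`. A backslash-escaped `#` is never treated as part of the closing sequence here because it is not preceded by a space; the escape itself is left in the raw content for the inline parser to resolve.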
763 764 Simple headings: 765 766 ```````````````````````````````` example 767 # foo 768 ## foo 769 ### foo 770 #### foo 771 ##### foo 772 ###### foo 773 . 774 <h1>foo</h1> 775 <h2>foo</h2> 776 <h3>foo</h3> 777 <h4>foo</h4> 778 <h5>foo</h5> 779 <h6>foo</h6> 780 ```````````````````````````````` 781 782 783 More than six `#` characters is not a heading: 784 785 ```````````````````````````````` example 786 ####### foo 787 . 788 <p>####### foo</p> 789 ```````````````````````````````` 790 791 792 At least one space is required between the `#` characters and the 793 heading's contents, unless the heading is empty. Note that many 794 implementations currently do not require the space. However, the 795 space was required by the 796 [original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py), 797 and it helps prevent things like the following from being parsed as 798 headings: 799 800 ```````````````````````````````` example 801 #5 bolt 802 803 #hashtag 804 . 805 <p>#5 bolt</p> 806 <p>#hashtag</p> 807 ```````````````````````````````` 808 809 810 This is not a heading, because the first `#` is escaped: 811 812 ```````````````````````````````` example 813 \## foo 814 . 815 <p>## foo</p> 816 ```````````````````````````````` 817 818 819 Contents are parsed as inlines: 820 821 ```````````````````````````````` example 822 # foo *bar* \*baz\* 823 . 824 <h1>foo <em>bar</em> *baz*</h1> 825 ```````````````````````````````` 826 827 828 Leading and trailing blanks are ignored in parsing inline content: 829 830 ```````````````````````````````` example 831 # foo 832 . 833 <h1>foo</h1> 834 ```````````````````````````````` 835 836 837 One to three spaces indentation are allowed: 838 839 ```````````````````````````````` example 840 ### foo 841 ## foo 842 # foo 843 . 844 <h3>foo</h3> 845 <h2>foo</h2> 846 <h1>foo</h1> 847 ```````````````````````````````` 848 849 850 Four spaces are too much: 851 852 ```````````````````````````````` example 853 # foo 854 . 
855 <pre><code># foo 856 </code></pre> 857 ```````````````````````````````` 858 859 860 ```````````````````````````````` example 861 foo 862 # bar 863 . 864 <p>foo 865 # bar</p> 866 ```````````````````````````````` 867 868 869 A closing sequence of `#` characters is optional: 870 871 ```````````````````````````````` example 872 ## foo ## 873 ### bar ### 874 . 875 <h2>foo</h2> 876 <h3>bar</h3> 877 ```````````````````````````````` 878 879 880 It need not be the same length as the opening sequence: 881 882 ```````````````````````````````` example 883 # foo ################################## 884 ##### foo ## 885 . 886 <h1>foo</h1> 887 <h5>foo</h5> 888 ```````````````````````````````` 889 890 891 Spaces are allowed after the closing sequence: 892 893 ```````````````````````````````` example 894 ### foo ### 895 . 896 <h3>foo</h3> 897 ```````````````````````````````` 898 899 900 A sequence of `#` characters with anything but [spaces] following it 901 is not a closing sequence, but counts as part of the contents of the 902 heading: 903 904 ```````````````````````````````` example 905 ### foo ### b 906 . 907 <h3>foo ### b</h3> 908 ```````````````````````````````` 909 910 911 The closing sequence must be preceded by a space: 912 913 ```````````````````````````````` example 914 # foo# 915 . 916 <h1>foo#</h1> 917 ```````````````````````````````` 918 919 920 Backslash-escaped `#` characters do not count as part 921 of the closing sequence: 922 923 ```````````````````````````````` example 924 ### foo \### 925 ## foo #\## 926 # foo \# 927 . 928 <h3>foo ###</h3> 929 <h2>foo ###</h2> 930 <h1>foo #</h1> 931 ```````````````````````````````` 932 933 934 ATX headings need not be separated from surrounding content by blank 935 lines, and they can interrupt paragraphs: 936 937 ```````````````````````````````` example 938 **** 939 ## foo 940 **** 941 . 
942 <hr /> 943 <h2>foo</h2> 944 <hr /> 945 ```````````````````````````````` 946 947 948 ```````````````````````````````` example 949 Foo bar 950 # baz 951 Bar foo 952 . 953 <p>Foo bar</p> 954 <h1>baz</h1> 955 <p>Bar foo</p> 956 ```````````````````````````````` 957 958 959 ATX headings can be empty: 960 961 ```````````````````````````````` example 962 ## 963 # 964 ### ### 965 . 966 <h2></h2> 967 <h1></h1> 968 <h3></h3> 969 ```````````````````````````````` 970 971 972 ## Setext headings 973 974 A [setext heading](@) consists of one or more 975 lines of text, each containing at least one [non-whitespace 976 character], with no more than 3 spaces indentation, followed by 977 a [setext heading underline]. The lines of text must be such 978 that, were they not followed by the setext heading underline, 979 they would be interpreted as a paragraph: they cannot be 980 interpretable as a [code fence], [ATX heading][ATX headings], 981 [block quote][block quotes], [thematic break][thematic breaks], 982 [list item][list items], or [HTML block][HTML blocks]. 983 984 A [setext heading underline](@) is a sequence of 985 `=` characters or a sequence of `-` characters, with no more than 3 986 spaces indentation and any number of trailing spaces. If a line 987 containing a single `-` can be interpreted as an 988 empty [list item][list items], it should be interpreted this way 989 and not as a [setext heading underline]. 990 991 The heading is a level 1 heading if `=` characters are used in 992 the [setext heading underline], and a level 2 heading if `-` 993 characters are used. The contents of the heading are the result 994 of parsing the preceding lines of text as CommonMark inline 995 content. 996 997 In general, a setext heading need not be preceded or followed by a 998 blank line. However, it cannot interrupt a paragraph, so when a 999 setext heading comes after a paragraph, a blank line is needed between 1000 them. 
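
The underline rule above can be sketched as a small Python classifier. This is a hedged illustration: the name `setext_level` is hypothetical, and the function judges the underline line in isolation. A real parser would also have to confirm that the preceding lines form a paragraph, that the underline is not a lazy continuation line, and that a single `-` is not better read as an empty list item.

```python
import re
from typing import Optional

# 0-3 spaces of indentation, a run of '=' or a run of '-', then trailing spaces only.
_SETEXT_UNDERLINE = re.compile(r"^ {0,3}(=+|-+) *$")


def setext_level(line: str) -> Optional[int]:
    """Return 1 for an '=' underline, 2 for a '-' underline, else None."""
    m = _SETEXT_UNDERLINE.match(line)
    if m is None:
        return None
    return 1 if m.group(1)[0] == "=" else 2
```

With this sketch, `setext_level("=========")` gives 1 and `setext_level("   ----      ")` gives 2, while `setext_level("= =")`, `setext_level("--- -")`, and `setext_level("    ---")` all give `None`, because internal spaces and four-space indentation are not allowed in an underline.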
1001 1002 Simple examples: 1003 1004 ```````````````````````````````` example 1005 Foo *bar* 1006 ========= 1007 1008 Foo *bar* 1009 --------- 1010 . 1011 <h1>Foo <em>bar</em></h1> 1012 <h2>Foo <em>bar</em></h2> 1013 ```````````````````````````````` 1014 1015 1016 The content of the header may span more than one line: 1017 1018 ```````````````````````````````` example 1019 Foo *bar 1020 baz* 1021 ==== 1022 . 1023 <h1>Foo <em>bar 1024 baz</em></h1> 1025 ```````````````````````````````` 1026 1027 1028 The underlining can be any length: 1029 1030 ```````````````````````````````` example 1031 Foo 1032 ------------------------- 1033 1034 Foo 1035 = 1036 . 1037 <h2>Foo</h2> 1038 <h1>Foo</h1> 1039 ```````````````````````````````` 1040 1041 1042 The heading content can be indented up to three spaces, and need 1043 not line up with the underlining: 1044 1045 ```````````````````````````````` example 1046 Foo 1047 --- 1048 1049 Foo 1050 ----- 1051 1052 Foo 1053 === 1054 . 1055 <h2>Foo</h2> 1056 <h2>Foo</h2> 1057 <h1>Foo</h1> 1058 ```````````````````````````````` 1059 1060 1061 Four spaces indent is too much: 1062 1063 ```````````````````````````````` example 1064 Foo 1065 --- 1066 1067 Foo 1068 --- 1069 . 1070 <pre><code>Foo 1071 --- 1072 1073 Foo 1074 </code></pre> 1075 <hr /> 1076 ```````````````````````````````` 1077 1078 1079 The setext heading underline can be indented up to three spaces, and 1080 may have trailing spaces: 1081 1082 ```````````````````````````````` example 1083 Foo 1084 ---- 1085 . 1086 <h2>Foo</h2> 1087 ```````````````````````````````` 1088 1089 1090 Four spaces is too much: 1091 1092 ```````````````````````````````` example 1093 Foo 1094 --- 1095 . 1096 <p>Foo 1097 ---</p> 1098 ```````````````````````````````` 1099 1100 1101 The setext heading underline cannot contain internal spaces: 1102 1103 ```````````````````````````````` example 1104 Foo 1105 = = 1106 1107 Foo 1108 --- - 1109 . 
1110 <p>Foo 1111 = =</p> 1112 <p>Foo</p> 1113 <hr /> 1114 ```````````````````````````````` 1115 1116 1117 Trailing spaces in the content line do not cause a line break: 1118 1119 ```````````````````````````````` example 1120 Foo 1121 ----- 1122 . 1123 <h2>Foo</h2> 1124 ```````````````````````````````` 1125 1126 1127 Nor does a backslash at the end: 1128 1129 ```````````````````````````````` example 1130 Foo\ 1131 ---- 1132 . 1133 <h2>Foo\</h2> 1134 ```````````````````````````````` 1135 1136 1137 Since indicators of block structure take precedence over 1138 indicators of inline structure, the following are setext headings: 1139 1140 ```````````````````````````````` example 1141 `Foo 1142 ---- 1143 ` 1144 1145 <a title="a lot 1146 --- 1147 of dashes"/> 1148 . 1149 <h2>`Foo</h2> 1150 <p>`</p> 1151 <h2><a title="a lot</h2> 1152 <p>of dashes"/></p> 1153 ```````````````````````````````` 1154 1155 1156 The setext heading underline cannot be a [lazy continuation 1157 line] in a list item or block quote: 1158 1159 ```````````````````````````````` example 1160 > Foo 1161 --- 1162 . 1163 <blockquote> 1164 <p>Foo</p> 1165 </blockquote> 1166 <hr /> 1167 ```````````````````````````````` 1168 1169 1170 ```````````````````````````````` example 1171 > foo 1172 bar 1173 === 1174 . 1175 <blockquote> 1176 <p>foo 1177 bar 1178 ===</p> 1179 </blockquote> 1180 ```````````````````````````````` 1181 1182 1183 ```````````````````````````````` example 1184 - Foo 1185 --- 1186 . 1187 <ul> 1188 <li>Foo</li> 1189 </ul> 1190 <hr /> 1191 ```````````````````````````````` 1192 1193 1194 A blank line is needed between a paragraph and a following 1195 setext heading, since otherwise the paragraph becomes part 1196 of the heading's content: 1197 1198 ```````````````````````````````` example 1199 Foo 1200 Bar 1201 --- 1202 . 
1203 <h2>Foo 1204 Bar</h2> 1205 ```````````````````````````````` 1206 1207 1208 But in general a blank line is not required before or after 1209 setext headings: 1210 1211 ```````````````````````````````` example 1212 --- 1213 Foo 1214 --- 1215 Bar 1216 --- 1217 Baz 1218 . 1219 <hr /> 1220 <h2>Foo</h2> 1221 <h2>Bar</h2> 1222 <p>Baz</p> 1223 ```````````````````````````````` 1224 1225 1226 Setext headings cannot be empty: 1227 1228 ```````````````````````````````` example 1229 1230 ==== 1231 . 1232 <p>====</p> 1233 ```````````````````````````````` 1234 1235 1236 Setext heading text lines must not be interpretable as block 1237 constructs other than paragraphs. So, the line of dashes 1238 in these examples gets interpreted as a thematic break: 1239 1240 ```````````````````````````````` example 1241 --- 1242 --- 1243 . 1244 <hr /> 1245 <hr /> 1246 ```````````````````````````````` 1247 1248 1249 ```````````````````````````````` example 1250 - foo 1251 ----- 1252 . 1253 <ul> 1254 <li>foo</li> 1255 </ul> 1256 <hr /> 1257 ```````````````````````````````` 1258 1259 1260 ```````````````````````````````` example 1261 foo 1262 --- 1263 . 1264 <pre><code>foo 1265 </code></pre> 1266 <hr /> 1267 ```````````````````````````````` 1268 1269 1270 ```````````````````````````````` example 1271 > foo 1272 ----- 1273 . 1274 <blockquote> 1275 <p>foo</p> 1276 </blockquote> 1277 <hr /> 1278 ```````````````````````````````` 1279 1280 1281 If you want a heading with `> foo` as its literal text, you can 1282 use backslash escapes: 1283 1284 ```````````````````````````````` example 1285 \> foo 1286 ------ 1287 . 1288 <h2>> foo</h2> 1289 ```````````````````````````````` 1290 1291 1292 **Compatibility note:** Most existing Markdown implementations 1293 do not allow the text of setext headings to span multiple lines. 
But there is no consensus about how to interpret

``` markdown
Foo
bar
---
baz
```

One can find four different interpretations:

1. paragraph "Foo", heading "bar", paragraph "baz"
2. paragraph "Foo bar", thematic break, paragraph "baz"
3. paragraph "Foo bar --- baz"
4. heading "Foo bar", paragraph "baz"

We find interpretation 4 most natural, and interpretation 4
increases the expressive power of CommonMark, by allowing
multiline headings.  Authors who want interpretation 1 can
put a blank line after the first paragraph:

```````````````````````````````` example
Foo

bar
---
baz
.
<p>Foo</p>
<h2>bar</h2>
<p>baz</p>
````````````````````````````````


Authors who want interpretation 2 can put blank lines around
the thematic break,

```````````````````````````````` example
Foo
bar

---

baz
.
<p>Foo
bar</p>
<hr />
<p>baz</p>
````````````````````````````````


or use a thematic break that cannot count as a [setext heading
underline], such as

```````````````````````````````` example
Foo
bar
* * *
baz
.
<p>Foo
bar</p>
<hr />
<p>baz</p>
````````````````````````````````


Authors who want interpretation 3 can use backslash escapes:

```````````````````````````````` example
Foo
bar
\---
baz
.
<p>Foo
bar
---
baz</p>
````````````````````````````````


## Indented code blocks

An [indented code block](@) is composed of one or more
[indented chunks] separated by blank lines.
An [indented chunk](@) is a sequence of non-blank lines,
each indented four or more spaces.
The contents of the code block are
the literal contents of the lines, including trailing
[line endings], minus four spaces of indentation.
An indented code block has no [info string].

An indented code block cannot interrupt a paragraph, so there must be
a blank line between a paragraph and a following indented code block.
(A blank line is not needed, however, between a code block and a following
paragraph.)

```````````````````````````````` example
    a simple
      indented code block
.
<pre><code>a simple
  indented code block
</code></pre>
````````````````````````````````


If there is any ambiguity between an interpretation of indentation
as a code block and as indicating that material belongs to a [list
item][list items], the list item interpretation takes precedence:

```````````````````````````````` example
  - foo

    bar
.
<ul>
<li>
<p>foo</p>
<p>bar</p>
</li>
</ul>
````````````````````````````````


```````````````````````````````` example
1.  foo

    - bar
.
<ol>
<li>
<p>foo</p>
<ul>
<li>bar</li>
</ul>
</li>
</ol>
````````````````````````````````



The contents of a code block are literal text, and do not get parsed
as Markdown:

```````````````````````````````` example
    <a/>
    *hi*

    - one
.
<pre><code>&lt;a/&gt;
*hi*

- one
</code></pre>
````````````````````````````````


Here we have three chunks separated by blank lines:

```````````````````````````````` example
    chunk1

    chunk2
  
 
 
    chunk3
.
<pre><code>chunk1

chunk2



chunk3
</code></pre>
````````````````````````````````


Any initial spaces beyond four will be included in the content, even
in interior blank lines:

```````````````````````````````` example
    chunk1
      
      chunk2
.
<pre><code>chunk1
  
  chunk2
</code></pre>
````````````````````````````````


An indented code block cannot interrupt a paragraph.  (This
allows hanging indents and the like.)

```````````````````````````````` example
Foo
    bar

.
<p>Foo
bar</p>
````````````````````````````````


However, any non-blank line with fewer than four leading spaces ends
the code block immediately.  So a paragraph may occur immediately
after indented code:

```````````````````````````````` example
    foo
bar
.
<pre><code>foo
</code></pre>
<p>bar</p>
````````````````````````````````


And indented code can occur immediately before and after other kinds of
blocks:

```````````````````````````````` example
# Heading
    foo
Heading
------
    foo
----
.
<h1>Heading</h1>
<pre><code>foo
</code></pre>
<h2>Heading</h2>
<pre><code>foo
</code></pre>
<hr />
````````````````````````````````


The first line can be indented more than four spaces:

```````````````````````````````` example
        foo
    bar
.
<pre><code>    foo
bar
</code></pre>
````````````````````````````````


Blank lines preceding or following an indented code block
are not included in it:

```````````````````````````````` example

    
    foo
    

.
<pre><code>foo
</code></pre>
````````````````````````````````


Trailing spaces are included in the code block's content:

```````````````````````````````` example
    foo  
.
<pre><code>foo  
</code></pre>
````````````````````````````````



## Fenced code blocks

A [code fence](@) is a sequence
of at least three consecutive backtick characters (`` ` ``) or
tildes (`~`).  (Tildes and backticks cannot be mixed.)
A [fenced code block](@)
begins with a code fence, indented no more than three spaces.

The line with the opening code fence may optionally contain some text
following the code fence; this is trimmed of leading and trailing
spaces and called the [info string](@).
The [info string] may not contain any backtick
characters.  (The reason for this restriction is that otherwise
some inline code would be incorrectly interpreted as the
beginning of a fenced code block.)

The content of the code block consists of all subsequent lines, until
a closing [code fence] of the same type as the code block
began with (backticks or tildes), and with at least as many backticks
or tildes as the opening code fence.  If the leading code fence is
indented N spaces, then up to N spaces of indentation are removed from
each line of the content (if present).  (If a content line is not
indented, it is preserved unchanged.  If it is indented less than N
spaces, all of the indentation is removed.)

The closing code fence may be indented up to three spaces, and may be
followed only by spaces, which are ignored.  If the end of the
containing block (or document) is reached and no closing code fence
has been found, the code block contains all of the lines after the
opening code fence until the end of the containing block (or
document).
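
As an illustration only (not part of the normative text), the rules
above for the opening fence, the info string, indentation removal, and
the closing fence can be sketched in Python.  The names `OPENING_FENCE`,
`strip_indent`, and `parse_fenced_block` are hypothetical, and the sketch
assumes that lines arrive without line endings or tabs and that no
container blocks are involved:

```python
import re

# Opening fence: up to three spaces of indentation, then a run of three or
# more backticks or three or more tildes, then the rest of the line.
OPENING_FENCE = re.compile(r"^( {0,3})(`{3,}|~{3,})(.*)$")


def strip_indent(line, n):
    """Remove up to n leading spaces from line."""
    k = 0
    while k < n and k < len(line) and line[k] == " ":
        k += 1
    return line[k:]


def parse_fenced_block(lines):
    """Parse a fenced code block that starts at lines[0].

    Returns (info_string, content_lines, lines_consumed), or None when
    lines[0] is not an opening code fence.
    """
    m = OPENING_FENCE.match(lines[0])
    if m is None:
        return None
    indent, fence = len(m.group(1)), m.group(2)
    info = m.group(3).strip(" ")
    if "`" in info:
        # Assumption: follow the prose literally and reject any backtick
        # in the info string, whichever fence character is used.
        return None

    # Closing fence: same character, at least as long as the opening
    # fence, indented at most three spaces, followed only by spaces.
    closing = re.compile(
        r"^ {0,3}" + re.escape(fence[0]) + "{" + str(len(fence)) + r",} *$"
    )

    content = []
    for offset, line in enumerate(lines[1:], start=1):
        if closing.match(line):
            return info, content, offset + 1
        content.append(strip_indent(line, indent))
    # No closing fence found: the block runs to the end of the input.
    return info, content, len(lines)
```

Note that the scan is a single forward pass: when no closing fence is
found, everything after the opening fence simply becomes content.
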
(An alternative spec would require backtracking in the
event that a closing code fence is not found.  But this makes parsing
much less efficient, and there seems to be no real down side to the
behavior described here.)

A fenced code block may interrupt a paragraph, and does not require
a blank line either before or after.

The content of a code fence is treated as literal text, not parsed
as inlines.  The first word of the [info string] is typically used to
specify the language of the code sample, and rendered in the `class`
attribute of the `code` tag.  However, this spec does not mandate any
particular treatment of the [info string].

Here is a simple example with backticks:

```````````````````````````````` example
```
<
 >
```
.
<pre><code>&lt;
 &gt;
</code></pre>
````````````````````````````````


With tildes:

```````````````````````````````` example
~~~
<
 >
~~~
.
<pre><code>&lt;
 &gt;
</code></pre>
````````````````````````````````


The closing code fence must use the same character as the opening
fence:

```````````````````````````````` example
```
aaa
~~~
```
.
<pre><code>aaa
~~~
</code></pre>
````````````````````````````````


```````````````````````````````` example
~~~
aaa
```
~~~
.
<pre><code>aaa
```
</code></pre>
````````````````````````````````


The closing code fence must be at least as long as the opening fence:

```````````````````````````````` example
````
aaa
```
``````
.
<pre><code>aaa
```
</code></pre>
````````````````````````````````


```````````````````````````````` example
~~~~
aaa
~~~
~~~~
.
<pre><code>aaa
~~~
</code></pre>
````````````````````````````````


Unclosed code blocks are closed by the end of the document
(or the enclosing [block quote][block quotes] or [list item][list items]):

```````````````````````````````` example
```
.
<pre><code></code></pre>
````````````````````````````````


```````````````````````````````` example
`````

```
aaa
.
<pre><code>
```
aaa
</code></pre>
````````````````````````````````


```````````````````````````````` example
> ```
> aaa

bbb
.
<blockquote>
<pre><code>aaa
</code></pre>
</blockquote>
<p>bbb</p>
````````````````````````````````


A code block can have all empty lines as its content:

```````````````````````````````` example
```

  
```
.
<pre><code>
  
</code></pre>
````````````````````````````````


A code block can be empty:

```````````````````````````````` example
```
```
.
<pre><code></code></pre>
````````````````````````````````


Fences can be indented.  If the opening fence is indented,
content lines will have equivalent opening indentation removed,
if present:

```````````````````````````````` example
 ```
 aaa
aaa
```
.
<pre><code>aaa
aaa
</code></pre>
````````````````````````````````


```````````````````````````````` example
  ```
aaa
  aaa
aaa
  ```
.
<pre><code>aaa
aaa
aaa
</code></pre>
````````````````````````````````


```````````````````````````````` example
   ```
   aaa
    aaa
  aaa
   ```
.
<pre><code>aaa
 aaa
aaa
</code></pre>
````````````````````````````````


Four spaces indentation produces an indented code block:

```````````````````````````````` example
    ```
    aaa
    ```
.
<pre><code>```
aaa
```
</code></pre>
````````````````````````````````


Closing fences may be indented by 0-3 spaces, and their indentation
need not match that of the opening fence:

```````````````````````````````` example
```
aaa
  ```
.
<pre><code>aaa
</code></pre>
````````````````````````````````


```````````````````````````````` example
   ```
aaa
  ```
.
<pre><code>aaa
</code></pre>
````````````````````````````````


This is not a closing fence, because it is indented 4 spaces:

```````````````````````````````` example
```
aaa
    ```
.
<pre><code>aaa
    ```
</code></pre>
````````````````````````````````



Code fences (opening and closing) cannot contain internal spaces:

```````````````````````````````` example
``` ```
aaa
.
<p><code></code>
aaa</p>
````````````````````````````````


```````````````````````````````` example
~~~~~~
aaa
~~~ ~~
.
<pre><code>aaa
~~~ ~~
</code></pre>
````````````````````````````````


Fenced code blocks can interrupt paragraphs, and can be followed
directly by paragraphs, without a blank line between:

```````````````````````````````` example
foo
```
bar
```
baz
.
<p>foo</p>
<pre><code>bar
</code></pre>
<p>baz</p>
````````````````````````````````


Other blocks can also occur before and after fenced code blocks
without an intervening blank line:

```````````````````````````````` example
foo
---
~~~
bar
~~~
# baz
.
<h2>foo</h2>
<pre><code>bar
</code></pre>
<h1>baz</h1>
````````````````````````````````


An [info string] can be provided after the opening code fence.
Opening and closing spaces will be stripped, and the first word, prefixed
with `language-`, is used as the value for the `class` attribute of the
`code` element within the enclosing `pre` element.

```````````````````````````````` example
```ruby
def foo(x)
  return 3
end
```
.
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
````````````````````````````````


```````````````````````````````` example
~~~~    ruby startline=3 $%@#$
def foo(x)
  return 3
end
~~~~~~~
.
<pre><code class="language-ruby">def foo(x)
  return 3
end
</code></pre>
````````````````````````````````


```````````````````````````````` example
````;
````
.
<pre><code class="language-;"></code></pre>
````````````````````````````````


[Info strings] for backtick code blocks cannot contain backticks:

```````````````````````````````` example
``` aa ```
foo
.
<p><code>aa</code>
foo</p>
````````````````````````````````


Closing code fences cannot have [info strings]:

```````````````````````````````` example
```
``` aaa
```
.
<pre><code>``` aaa
</code></pre>
````````````````````````````````



## HTML blocks

An [HTML block](@) is a group of lines that is treated
as raw HTML (and will not be escaped in HTML output).

There are seven kinds of [HTML block], which can be defined
by their start and end conditions.  The block begins with a line that
meets a [start condition](@) (after up to three spaces
optional indentation).  It ends with the first subsequent line that
meets a matching [end condition](@), or the last line of
the document (or other [container block]), if no line is encountered
that meets the [end condition].  If the first line meets both the
[start condition] and the [end condition], the block will contain
just that line.

1.  **Start condition:**  line begins with the string `<script`,
`<pre`, or `<style` (case-insensitive), followed by whitespace,
the string `>`, or the end of the line.\
**End condition:**  line contains an end tag
`</script>`, `</pre>`, or `</style>` (case-insensitive; it
need not match the start tag).

2.  **Start condition:** line begins with the string `<!--`.\
**End condition:**  line contains the string `-->`.

3.  **Start condition:** line begins with the string `<?`.\
**End condition:** line contains the string `?>`.

4.  **Start condition:** line begins with the string `<!`
followed by an uppercase ASCII letter.\
**End condition:** line contains the character `>`.

5.  **Start condition:**  line begins with the string
`<![CDATA[`.\
**End condition:** line contains the string `]]>`.

6.  **Start condition:** line begins with the string `<` or `</`
followed by one of the strings (case-insensitive) `address`,
`article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
`footer`, `form`, `frame`, `frameset`,
`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
`meta`, `nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
`section`, `source`, `summary`, `table`, `tbody`, `td`,
`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
by [whitespace], the end of the line, the string `>`, or
the string `/>`.\
**End condition:** line is followed by a [blank line].

7.  **Start condition:**  line begins with a complete [open tag]
or [closing tag] (with any [tag name] other than `script`,
`style`, or `pre`) followed only by [whitespace]
or the end of the line.\
**End condition:** line is followed by a [blank line].

All types of [HTML blocks] except type 7 may interrupt
a paragraph.  Blocks of type 7 may not interrupt a paragraph.
(This restriction is intended to prevent unwanted interpretation
of long tags inside a wrapped paragraph as starting HTML blocks.)

Some simple examples follow.  Here are some basic HTML blocks
of type 6:

```````````````````````````````` example
<table>
  <tr>
    <td>
           hi
    </td>
  </tr>
</table>

okay.
.
<table>
  <tr>
    <td>
           hi
    </td>
  </tr>
</table>
<p>okay.</p>
````````````````````````````````


```````````````````````````````` example
 <div>
  *hello*
         <foo><a>
.
 <div>
  *hello*
         <foo><a>
````````````````````````````````


A block can also start with a closing tag:

```````````````````````````````` example
</div>
*foo*
.
</div>
*foo*
````````````````````````````````


Here we have two HTML blocks with a Markdown paragraph between them:

```````````````````````````````` example
<DIV CLASS="foo">

*Markdown*

</DIV>
.
<DIV CLASS="foo">
<p><em>Markdown</em></p>
</DIV>
````````````````````````````````


The tag on the first line can be partial, as long
as it is split where there would be whitespace:

```````````````````````````````` example
<div id="foo"
  class="bar">
</div>
.
<div id="foo"
  class="bar">
</div>
````````````````````````````````


```````````````````````````````` example
<div id="foo" class="bar
  baz">
</div>
.
<div id="foo" class="bar
  baz">
</div>
````````````````````````````````


An open tag need not be closed:
```````````````````````````````` example
<div>
*foo*

*bar*
.
<div>
*foo*
<p><em>bar</em></p>
````````````````````````````````



A partial tag need not even be completed (garbage
in, garbage out):

```````````````````````````````` example
<div id="foo"
*hi*
.
<div id="foo"
*hi*
````````````````````````````````


```````````````````````````````` example
<div class
foo
.
<div class
foo
````````````````````````````````


The initial tag doesn't even need to be a valid
tag, as long as it starts like one:

```````````````````````````````` example
<div *???-&&&-<---
*foo*
.
<div *???-&&&-<---
*foo*
````````````````````````````````


In type 6 blocks, the initial tag need not be on a line by
itself:

```````````````````````````````` example
<div><a href="bar">*foo*</a></div>
.
<div><a href="bar">*foo*</a></div>
````````````````````````````````


```````````````````````````````` example
<table><tr><td>
foo
</td></tr></table>
.
<table><tr><td>
foo
</td></tr></table>
````````````````````````````````


Everything until the next blank line or end of document
gets included in the HTML block.  So, in the following
example, what looks like a Markdown code block
is actually part of the HTML block, which continues until a blank
line or the end of the document is reached:

```````````````````````````````` example
<div></div>
``` c
int x = 33;
```
.
<div></div>
``` c
int x = 33;
```
````````````````````````````````


To start an [HTML block] with a tag that is *not* in the
list of block-level tags in (6), you must put the tag by
itself on the first line (and it must be complete):

```````````````````````````````` example
<a href="foo">
*bar*
</a>
.
<a href="foo">
*bar*
</a>
````````````````````````````````


In type 7 blocks, the [tag name] can be anything:

```````````````````````````````` example
<Warning>
*bar*
</Warning>
.
<Warning>
*bar*
</Warning>
````````````````````````````````


```````````````````````````````` example
<i class="foo">
*bar*
</i>
.
<i class="foo">
*bar*
</i>
````````````````````````````````


```````````````````````````````` example
</ins>
*bar*
.
</ins>
*bar*
````````````````````````````````


These rules are designed to allow us to work with tags that
can function as either block-level or inline-level tags.
The `<del>` tag is a nice example.  We can surround content with
`<del>` tags in three different ways.  In this case, we get a raw
HTML block, because the `<del>` tag is on a line by itself:

```````````````````````````````` example
<del>
*foo*
</del>
.
<del>
*foo*
</del>
````````````````````````````````


In this case, we get a raw HTML block that just includes
the `<del>` tag (because it ends with the following blank
line).  So the contents get interpreted as CommonMark:

```````````````````````````````` example
<del>

*foo*

</del>
.
<del>
<p><em>foo</em></p>
</del>
````````````````````````````````


Finally, in this case, the `<del>` tags are interpreted
as [raw HTML] *inside* the CommonMark paragraph.  (Because
the tag is not on a line by itself, we get inline HTML
rather than an [HTML block].)

```````````````````````````````` example
<del>*foo*</del>
.
<p><del><em>foo</em></del></p>
````````````````````````````````


HTML tags designed to contain literal content
(`script`, `style`, `pre`), comments, processing instructions,
and declarations are treated somewhat differently.
Instead of ending at the first blank line, these blocks
end at the first line containing a corresponding end tag.
As a result, these blocks can contain blank lines:

A pre tag (type 1):

```````````````````````````````` example
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
okay
.
<pre language="haskell"><code>
import Text.HTML.TagSoup

main :: IO ()
main = print $ parseTags tags
</code></pre>
<p>okay</p>
````````````````````````````````


A script tag (type 1):

```````````````````````````````` example
<script type="text/javascript">
// JavaScript example

document.getElementById("demo").innerHTML = "Hello JavaScript!";
</script>
okay
.
<script type="text/javascript">
// JavaScript example

document.getElementById("demo").innerHTML = "Hello JavaScript!";
</script>
<p>okay</p>
````````````````````````````````


A style tag (type 1):

```````````````````````````````` example
<style
  type="text/css">
h1 {color:red;}

p {color:blue;}
</style>
okay
.
<style
  type="text/css">
h1 {color:red;}

p {color:blue;}
</style>
<p>okay</p>
````````````````````````````````


If there is no matching end tag, the block will end at the
end of the document (or the enclosing [block quote][block quotes]
or [list item][list items]):

```````````````````````````````` example
<style
  type="text/css">

foo
.
<style
  type="text/css">

foo
````````````````````````````````


```````````````````````````````` example
> <div>
> foo

bar
.
<blockquote>
<div>
foo
</blockquote>
<p>bar</p>
````````````````````````````````


```````````````````````````````` example
- <div>
- foo
.
<ul>
<li>
<div>
</li>
<li>foo</li>
</ul>
````````````````````````````````


The end tag can occur on the same line as the start tag:

```````````````````````````````` example
<style>p{color:red;}</style>
*foo*
.
<style>p{color:red;}</style>
<p><em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
<!-- foo -->*bar*
*baz*
.
<!-- foo -->*bar*
<p><em>baz</em></p>
````````````````````````````````


Note that anything on the last line after the
end tag will be included in the [HTML block]:

```````````````````````````````` example
<script>
foo
</script>1. *bar*
.
<script>
foo
</script>1. *bar*
````````````````````````````````


A comment (type 2):

```````````````````````````````` example
<!-- Foo

bar
   baz -->
okay
.
<!-- Foo

bar
   baz -->
<p>okay</p>
````````````````````````````````



A processing instruction (type 3):

```````````````````````````````` example
<?php

  echo '>';

?>
okay
.
<?php

  echo '>';

?>
<p>okay</p>
````````````````````````````````


A declaration (type 4):

```````````````````````````````` example
<!DOCTYPE html>
.
<!DOCTYPE html>
````````````````````````````````


CDATA (type 5):

```````````````````````````````` example
<![CDATA[
function matchwo(a,b)
{
  if (a < b && a < 0) then {
    return 1;

  } else {

    return 0;
  }
}
]]>
okay
.
<![CDATA[
function matchwo(a,b)
{
  if (a < b && a < 0) then {
    return 1;

  } else {

    return 0;
  }
}
]]>
<p>okay</p>
````````````````````````````````


The opening tag can be indented 1-3 spaces, but not 4:

```````````````````````````````` example
  <!-- foo -->

    <!-- foo -->
.
  <!-- foo -->
<pre><code>&lt;!-- foo --&gt;
</code></pre>
````````````````````````````````


```````````````````````````````` example
  <div>

    <div>
.
  <div>
<pre><code>&lt;div&gt;
</code></pre>
````````````````````````````````


An HTML block of types 1--6 can interrupt a paragraph, and need not be
preceded by a blank line.

```````````````````````````````` example
Foo
<div>
bar
</div>
.
<p>Foo</p>
<div>
bar
</div>
````````````````````````````````


However, a following blank line is needed, except at the end of
a document, and except for blocks of types 1--5, above:

```````````````````````````````` example
<div>
bar
</div>
*foo*
.
<div>
bar
</div>
*foo*
````````````````````````````````


HTML blocks of type 7 cannot interrupt a paragraph:

```````````````````````````````` example
Foo
<a href="bar">
baz
.
<p>Foo
<a href="bar">
baz</p>
````````````````````````````````


This rule differs from John Gruber's original Markdown syntax
specification, which says:

> The only restrictions are that block-level HTML elements —
> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
> surrounding content by blank lines, and the start and end tags of the
> block should not be indented with tabs or spaces.

In some ways Gruber's rule is more restrictive than the one given
here:

- It requires that an HTML block be preceded by a blank line.
- It does not allow the start tag to be indented.
- It requires a matching end tag, which it also does not allow to
  be indented.

Most Markdown implementations (including some of Gruber's own) do not
respect all of these restrictions.
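
To make the comparison concrete, here is a rough Python sketch of how
start conditions 1 to 6 from the list earlier in this section might be
checked against a single line.  Unlike Gruber's rule, the check needs no
preceding blank line and tolerates up to three spaces of indentation.
The names `BLOCK_TAGS`, `START_CONDITIONS`, and `html_block_start` are
hypothetical; type 7 is left out because it needs the full [open tag]
and [closing tag] grammar, and the sketch assumes lines without tabs or
line endings:

```python
import re

# Tag names listed under start condition 6.
BLOCK_TAGS = "|".join([
    "address", "article", "aside", "base", "basefont", "blockquote", "body",
    "caption", "center", "col", "colgroup", "dd", "details", "dialog",
    "dir", "div", "dl", "dt", "fieldset", "figcaption", "figure",
    "footer", "form", "frame", "frameset",
    "h1", "h2", "h3", "h4", "h5", "h6", "head", "header", "hr",
    "html", "iframe", "legend", "li", "link", "main", "menu", "menuitem",
    "meta", "nav", "noframes", "ol", "optgroup", "option", "p", "param",
    "section", "source", "summary", "table", "tbody", "td",
    "tfoot", "th", "thead", "title", "tr", "track", "ul",
])

# (type number, pattern matched at the start of the de-indented line)
START_CONDITIONS = [
    (1, re.compile(r"<(script|pre|style)(\s|>|$)", re.IGNORECASE)),
    (2, re.compile(r"<!--")),
    (3, re.compile(r"<\?")),
    (4, re.compile(r"<![A-Z]")),
    (5, re.compile(r"<!\[CDATA\[")),
    (6, re.compile(r"</?(" + BLOCK_TAGS + r")(\s|/?>|$)", re.IGNORECASE)),
]


def html_block_start(line):
    """Return the HTML block type (1-6) that `line` would start, or None."""
    # Skip at most three spaces of indentation; a fourth leading space
    # leaves a space in front of "<", so no pattern below can match.
    rest = line[re.match(r" {0,3}", line).end():]
    for kind, pattern in START_CONDITIONS:
        if pattern.match(rest):
            return kind
    return None
```

Because each condition looks only at the current line, no balanced-tag
parsing or lookahead is involved in deciding where a block begins.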

There is one respect, however, in which Gruber's rule is more liberal
than the one given here, since it allows blank lines to occur inside
an HTML block.  There are two reasons for disallowing them here.
First, it removes the need to parse balanced tags, which is
expensive and can require backtracking from the end of the document
if no matching end tag is found.  Second, it provides a very simple
and flexible way of including Markdown content inside HTML tags:
simply separate the Markdown from the HTML using blank lines:

Compare:

```````````````````````````````` example
<div>

*Emphasized* text.

</div>
.
<div>
<p><em>Emphasized</em> text.</p>
</div>
````````````````````````````````


```````````````````````````````` example
<div>
*Emphasized* text.
</div>
.
<div>
*Emphasized* text.
</div>
````````````````````````````````


Some Markdown implementations have adopted a convention of
interpreting content inside tags as text if the open tag has
the attribute `markdown=1`.  The rule given above seems a simpler and
more elegant way of achieving the same expressive power, which is also
much simpler to parse.

The main potential drawback is that one can no longer paste HTML
blocks into Markdown documents with 100% reliability.  However,
*in most cases* this will work fine, because the blank lines in
HTML are usually followed by HTML block tags.  For example:

```````````````````````````````` example
<table>

<tr>

<td>
Hi
</td>

</tr>

</table>
.
<table>
<tr>
<td>
Hi
</td>
</tr>
</table>
````````````````````````````````


There are problems, however, if the inner tags are indented
*and* separated by blank lines, as then they will be interpreted as
an indented code block:

```````````````````````````````` example
<table>

  <tr>

    <td>
      Hi
    </td>

  </tr>

</table>
.
<table>
  <tr>
<pre><code>&lt;td&gt;
  Hi
&lt;/td&gt;
</code></pre>
  </tr>
</table>
````````````````````````````````


Fortunately, blank lines are usually not necessary and can be
deleted.  The exception is inside `<pre>` tags, but as described
above, raw HTML blocks starting with `<pre>` *can* contain blank
lines.

## Link reference definitions

A [link reference definition](@)
consists of a [link label], indented up to three spaces, followed
by a colon (`:`), optional [whitespace] (including up to one
[line ending]), a [link destination],
optional [whitespace] (including up to one
[line ending]), and an optional [link
title], which if it is present must be separated
from the [link destination] by [whitespace].
No further [non-whitespace characters] may occur on the line.

A [link reference definition]
does not correspond to a structural element of a document.  Instead, it
defines a label which can be used in [reference links]
and reference-style [images] elsewhere in the document.  [Link
reference definitions] can come either before or after the links that use
them.

```````````````````````````````` example
[foo]: /url "title"

[foo]
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


```````````````````````````````` example
   [foo]: 
      /url  
           'the title'  

[foo]
.
2759 <p><a href="/url" title="the title">foo</a></p> 2760 ```````````````````````````````` 2761 2762 2763 ```````````````````````````````` example 2764 [Foo*bar\]]:my_(url) 'title (with parens)' 2765 2766 [Foo*bar\]] 2767 . 2768 <p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p> 2769 ```````````````````````````````` 2770 2771 2772 ```````````````````````````````` example 2773 [Foo bar]: 2774 <my%20url> 2775 'title' 2776 2777 [Foo bar] 2778 . 2779 <p><a href="my%20url" title="title">Foo bar</a></p> 2780 ```````````````````````````````` 2781 2782 2783 The title may extend over multiple lines: 2784 2785 ```````````````````````````````` example 2786 [foo]: /url ' 2787 title 2788 line1 2789 line2 2790 ' 2791 2792 [foo] 2793 . 2794 <p><a href="/url" title=" 2795 title 2796 line1 2797 line2 2798 ">foo</a></p> 2799 ```````````````````````````````` 2800 2801 2802 However, it may not contain a [blank line]: 2803 2804 ```````````````````````````````` example 2805 [foo]: /url 'title 2806 2807 with blank line' 2808 2809 [foo] 2810 . 2811 <p>[foo]: /url 'title</p> 2812 <p>with blank line'</p> 2813 <p>[foo]</p> 2814 ```````````````````````````````` 2815 2816 2817 The title may be omitted: 2818 2819 ```````````````````````````````` example 2820 [foo]: 2821 /url 2822 2823 [foo] 2824 . 2825 <p><a href="/url">foo</a></p> 2826 ```````````````````````````````` 2827 2828 2829 The link destination may not be omitted: 2830 2831 ```````````````````````````````` example 2832 [foo]: 2833 2834 [foo] 2835 . 2836 <p>[foo]:</p> 2837 <p>[foo]</p> 2838 ```````````````````````````````` 2839 2840 2841 Both title and destination can contain backslash escapes 2842 and literal backslashes: 2843 2844 ```````````````````````````````` example 2845 [foo]: /url\bar\*baz "foo\"bar\baz" 2846 2847 [foo] 2848 . 
2849 <p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p> 2850 ```````````````````````````````` 2851 2852 2853 A link can come before its corresponding definition: 2854 2855 ```````````````````````````````` example 2856 [foo] 2857 2858 [foo]: url 2859 . 2860 <p><a href="url">foo</a></p> 2861 ```````````````````````````````` 2862 2863 2864 If there are several matching definitions, the first one takes 2865 precedence: 2866 2867 ```````````````````````````````` example 2868 [foo] 2869 2870 [foo]: first 2871 [foo]: second 2872 . 2873 <p><a href="first">foo</a></p> 2874 ```````````````````````````````` 2875 2876 2877 As noted in the section on [Links], matching of labels is 2878 case-insensitive (see [matches]). 2879 2880 ```````````````````````````````` example 2881 [FOO]: /url 2882 2883 [Foo] 2884 . 2885 <p><a href="/url">Foo</a></p> 2886 ```````````````````````````````` 2887 2888 2889 ```````````````````````````````` example 2890 [ΑΓΩ]: /φου 2891 2892 [αγω] 2893 . 2894 <p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p> 2895 ```````````````````````````````` 2896 2897 2898 Here is a link reference definition with no corresponding link. 2899 It contributes nothing to the document. 2900 2901 ```````````````````````````````` example 2902 [foo]: /url 2903 . 2904 ```````````````````````````````` 2905 2906 2907 Here is another one: 2908 2909 ```````````````````````````````` example 2910 [ 2911 foo 2912 ]: /url 2913 bar 2914 . 2915 <p>bar</p> 2916 ```````````````````````````````` 2917 2918 2919 This is not a link reference definition, because there are 2920 [non-whitespace characters] after the title: 2921 2922 ```````````````````````````````` example 2923 [foo]: /url "title" ok 2924 . 2925 <p>[foo]: /url &quot;title&quot; ok</p> 2926 ```````````````````````````````` 2927 2928 2929 This is a link reference definition, but it has no title: 2930 2931 ```````````````````````````````` example 2932 [foo]: /url 2933 "title" ok 2934 . 
2935 <p>&quot;title&quot; ok</p> 2936 ```````````````````````````````` 2937 2938 2939 This is not a link reference definition, because it is indented 2940 four spaces: 2941 2942 ```````````````````````````````` example 2943 [foo]: /url "title" 2944 2945 [foo] 2946 . 2947 <pre><code>[foo]: /url &quot;title&quot; 2948 </code></pre> 2949 <p>[foo]</p> 2950 ```````````````````````````````` 2951 2952 2953 This is not a link reference definition, because it occurs inside 2954 a code block: 2955 2956 ```````````````````````````````` example 2957 ``` 2958 [foo]: /url 2959 ``` 2960 2961 [foo] 2962 . 2963 <pre><code>[foo]: /url 2964 </code></pre> 2965 <p>[foo]</p> 2966 ```````````````````````````````` 2967 2968 2969 A [link reference definition] cannot interrupt a paragraph. 2970 2971 ```````````````````````````````` example 2972 Foo 2973 [bar]: /baz 2974 2975 [bar] 2976 . 2977 <p>Foo 2978 [bar]: /baz</p> 2979 <p>[bar]</p> 2980 ```````````````````````````````` 2981 2982 2983 However, it can directly follow other block elements, such as headings 2984 and thematic breaks, and it need not be followed by a blank line. 2985 2986 ```````````````````````````````` example 2987 # [Foo] 2988 [foo]: /url 2989 > bar 2990 . 2991 <h1><a href="/url">Foo</a></h1> 2992 <blockquote> 2993 <p>bar</p> 2994 </blockquote> 2995 ```````````````````````````````` 2996 2997 2998 Several [link reference definitions] 2999 can occur one after another, without intervening blank lines. 3000 3001 ```````````````````````````````` example 3002 [foo]: /foo-url "foo" 3003 [bar]: /bar-url 3004 "bar" 3005 [baz]: /baz-url 3006 3007 [foo], 3008 [bar], 3009 [baz] 3010 . 3011 <p><a href="/foo-url" title="foo">foo</a>, 3012 <a href="/bar-url" title="bar">bar</a>, 3013 <a href="/baz-url">baz</a></p> 3014 ```````````````````````````````` 3015 3016 3017 [Link reference definitions] can occur 3018 inside block containers, like lists and block quotations. 
They 3019 affect the entire document, not just the container in which they 3020 are defined: 3021 3022 ```````````````````````````````` example 3023 [foo] 3024 3025 > [foo]: /url 3026 . 3027 <p><a href="/url">foo</a></p> 3028 <blockquote> 3029 </blockquote> 3030 ```````````````````````````````` 3031 3032 3033 3034 ## Paragraphs 3035 3036 A sequence of non-blank lines that cannot be interpreted as other 3037 kinds of blocks forms a [paragraph](@). 3038 The contents of the paragraph are the result of parsing the 3039 paragraph's raw content as inlines. The paragraph's raw content 3040 is formed by concatenating the lines and removing initial and final 3041 [whitespace]. 3042 3043 A simple example with two paragraphs: 3044 3045 ```````````````````````````````` example 3046 aaa 3047 3048 bbb 3049 . 3050 <p>aaa</p> 3051 <p>bbb</p> 3052 ```````````````````````````````` 3053 3054 3055 Paragraphs can contain multiple lines, but no blank lines: 3056 3057 ```````````````````````````````` example 3058 aaa 3059 bbb 3060 3061 ccc 3062 ddd 3063 . 3064 <p>aaa 3065 bbb</p> 3066 <p>ccc 3067 ddd</p> 3068 ```````````````````````````````` 3069 3070 3071 Multiple blank lines between paragraphs have no effect: 3072 3073 ```````````````````````````````` example 3074 aaa 3075 3076 3077 bbb 3078 . 3079 <p>aaa</p> 3080 <p>bbb</p> 3081 ```````````````````````````````` 3082 3083 3084 Leading spaces are skipped: 3085 3086 ```````````````````````````````` example 3087 aaa 3088 bbb 3089 . 3090 <p>aaa 3091 bbb</p> 3092 ```````````````````````````````` 3093 3094 3095 Lines after the first may be indented any amount, since indented 3096 code blocks cannot interrupt paragraphs. 3097 3098 ```````````````````````````````` example 3099 aaa 3100 bbb 3101 ccc 3102 . 
3103 <p>aaa 3104 bbb 3105 ccc</p> 3106 ```````````````````````````````` 3107 3108 3109 However, the first line may be indented at most three spaces, 3110 or an indented code block will be triggered: 3111 3112 ```````````````````````````````` example 3113 aaa 3114 bbb 3115 . 3116 <p>aaa 3117 bbb</p> 3118 ```````````````````````````````` 3119 3120 3121 ```````````````````````````````` example 3122 aaa 3123 bbb 3124 . 3125 <pre><code>aaa 3126 </code></pre> 3127 <p>bbb</p> 3128 ```````````````````````````````` 3129 3130 3131 Final spaces are stripped before inline parsing, so a paragraph 3132 that ends with two or more spaces will not end with a [hard line 3133 break]: 3134 3135 ```````````````````````````````` example 3136 aaa 3137 bbb 3138 . 3139 <p>aaa<br /> 3140 bbb</p> 3141 ```````````````````````````````` 3142 3143 3144 ## Blank lines 3145 3146 [Blank lines] between block-level elements are ignored, 3147 except for the role they play in determining whether a [list] 3148 is [tight] or [loose]. 3149 3150 Blank lines at the beginning and end of the document are also ignored. 3151 3152 ```````````````````````````````` example 3153 3154 3155 aaa 3156 3157 3158 # aaa 3159 3160 3161 . 3162 <p>aaa</p> 3163 <h1>aaa</h1> 3164 ```````````````````````````````` 3165 3166 3167 3168 # Container blocks 3169 3170 A [container block] is a block that has other 3171 blocks as its contents. There are two basic kinds of container blocks: 3172 [block quotes] and [list items]. 3173 [Lists] are meta-containers for [list items]. 3174 3175 We define the syntax for container blocks recursively. The general 3176 form of the definition is: 3177 3178 > If X is a sequence of blocks, then the result of 3179 > transforming X in such-and-such a way is a container of type Y 3180 > with these blocks as its content. 3181 3182 So, we explain what counts as a block quote or list item by explaining 3183 how these can be *generated* from their contents. 
This should suffice 3184 to define the syntax, although it does not give a recipe for *parsing* 3185 these constructions. (A recipe is provided below in the section entitled 3186 [A parsing strategy](#appendix-a-parsing-strategy).) 3187 3188 ## Block quotes 3189 3190 A [block quote marker](@) 3191 consists of 0-3 spaces of initial indent, plus (a) the character `>` together 3192 with a following space, or (b) a single character `>` not followed by a space. 3193 3194 The following rules define [block quotes]: 3195 3196 1. **Basic case.** If a string of lines *Ls* constitute a sequence 3197 of blocks *Bs*, then the result of prepending a [block quote 3198 marker] to the beginning of each line in *Ls* 3199 is a [block quote](#block-quotes) containing *Bs*. 3200 3201 2. **Laziness.** If a string of lines *Ls* constitute a [block 3202 quote](#block-quotes) with contents *Bs*, then the result of deleting 3203 the initial [block quote marker] from one or 3204 more lines in which the next [non-whitespace character] after the [block 3205 quote marker] is [paragraph continuation 3206 text] is a block quote with *Bs* as its content. 3207 [Paragraph continuation text](@) is text 3208 that will be parsed as part of the content of a paragraph, but does 3209 not occur at the beginning of the paragraph. 3210 3211 3. **Consecutiveness.** A document cannot contain two [block 3212 quotes] in a row unless there is a [blank line] between them. 3213 3214 Nothing else counts as a [block quote](#block-quotes). 3215 3216 Here is a simple example: 3217 3218 ```````````````````````````````` example 3219 > # Foo 3220 > bar 3221 > baz 3222 . 3223 <blockquote> 3224 <h1>Foo</h1> 3225 <p>bar 3226 baz</p> 3227 </blockquote> 3228 ```````````````````````````````` 3229 3230 3231 The spaces after the `>` characters can be omitted: 3232 3233 ```````````````````````````````` example 3234 ># Foo 3235 >bar 3236 > baz 3237 . 
3238 <blockquote> 3239 <h1>Foo</h1> 3240 <p>bar 3241 baz</p> 3242 </blockquote> 3243 ```````````````````````````````` 3244 3245 3246 The `>` characters can be indented 1-3 spaces: 3247 3248 ```````````````````````````````` example 3249 > # Foo 3250 > bar 3251 > baz 3252 . 3253 <blockquote> 3254 <h1>Foo</h1> 3255 <p>bar 3256 baz</p> 3257 </blockquote> 3258 ```````````````````````````````` 3259 3260 3261 Four spaces give us a code block: 3262 3263 ```````````````````````````````` example 3264 > # Foo 3265 > bar 3266 > baz 3267 . 3268 <pre><code>&gt; # Foo 3269 &gt; bar 3270 &gt; baz 3271 </code></pre> 3272 ```````````````````````````````` 3273 3274 3275 The Laziness clause allows us to omit the `>` before 3276 [paragraph continuation text]: 3277 3278 ```````````````````````````````` example 3279 > # Foo 3280 > bar 3281 baz 3282 . 3283 <blockquote> 3284 <h1>Foo</h1> 3285 <p>bar 3286 baz</p> 3287 </blockquote> 3288 ```````````````````````````````` 3289 3290 3291 A block quote can contain some lazy and some non-lazy 3292 continuation lines: 3293 3294 ```````````````````````````````` example 3295 > bar 3296 baz 3297 > foo 3298 . 3299 <blockquote> 3300 <p>bar 3301 baz 3302 foo</p> 3303 </blockquote> 3304 ```````````````````````````````` 3305 3306 3307 Laziness only applies to lines that would have been continuations of 3308 paragraphs had they been prepended with [block quote markers]. 3309 For example, the `> ` cannot be omitted in the second line of 3310 3311 ``` markdown 3312 > foo 3313 > --- 3314 ``` 3315 3316 without changing the meaning: 3317 3318 ```````````````````````````````` example 3319 > foo 3320 --- 3321 . 
3322 <blockquote> 3323 <p>foo</p> 3324 </blockquote> 3325 <hr /> 3326 ```````````````````````````````` 3327 3328 3329 Similarly, if we omit the `> ` in the second line of 3330 3331 ``` markdown 3332 > - foo 3333 > - bar 3334 ``` 3335 3336 then the block quote ends after the first line: 3337 3338 ```````````````````````````````` example 3339 > - foo 3340 - bar 3341 . 3342 <blockquote> 3343 <ul> 3344 <li>foo</li> 3345 </ul> 3346 </blockquote> 3347 <ul> 3348 <li>bar</li> 3349 </ul> 3350 ```````````````````````````````` 3351 3352 3353 For the same reason, we can't omit the `> ` in front of 3354 subsequent lines of an indented or fenced code block: 3355 3356 ```````````````````````````````` example 3357 > foo 3358 bar 3359 . 3360 <blockquote> 3361 <pre><code>foo 3362 </code></pre> 3363 </blockquote> 3364 <pre><code>bar 3365 </code></pre> 3366 ```````````````````````````````` 3367 3368 3369 ```````````````````````````````` example 3370 > ``` 3371 foo 3372 ``` 3373 . 3374 <blockquote> 3375 <pre><code></code></pre> 3376 </blockquote> 3377 <p>foo</p> 3378 <pre><code></code></pre> 3379 ```````````````````````````````` 3380 3381 3382 Note that in the following case, we have a [lazy 3383 continuation line]: 3384 3385 ```````````````````````````````` example 3386 > foo 3387 - bar 3388 . 3389 <blockquote> 3390 <p>foo 3391 - bar</p> 3392 </blockquote> 3393 ```````````````````````````````` 3394 3395 3396 To see why, note that in 3397 3398 ```markdown 3399 > foo 3400 > - bar 3401 ``` 3402 3403 the `- bar` is indented too far to start a list, and can't 3404 be an indented code block because indented code blocks cannot 3405 interrupt paragraphs, so it is [paragraph continuation text]. 3406 3407 A block quote can be empty: 3408 3409 ```````````````````````````````` example 3410 > 3411 . 3412 <blockquote> 3413 </blockquote> 3414 ```````````````````````````````` 3415 3416 3417 ```````````````````````````````` example 3418 > 3419 > 3420 > 3421 . 
3422 <blockquote> 3423 </blockquote> 3424 ```````````````````````````````` 3425 3426 3427 A block quote can have initial or final blank lines: 3428 3429 ```````````````````````````````` example 3430 > 3431 > foo 3432 > 3433 . 3434 <blockquote> 3435 <p>foo</p> 3436 </blockquote> 3437 ```````````````````````````````` 3438 3439 3440 A blank line always separates block quotes: 3441 3442 ```````````````````````````````` example 3443 > foo 3444 3445 > bar 3446 . 3447 <blockquote> 3448 <p>foo</p> 3449 </blockquote> 3450 <blockquote> 3451 <p>bar</p> 3452 </blockquote> 3453 ```````````````````````````````` 3454 3455 3456 (Most current Markdown implementations, including John Gruber's 3457 original `Markdown.pl`, will parse this example as a single block quote 3458 with two paragraphs. But it seems better to allow the author to decide 3459 whether two block quotes or one are wanted.) 3460 3461 Consecutiveness means that if we put these block quotes together, 3462 we get a single block quote: 3463 3464 ```````````````````````````````` example 3465 > foo 3466 > bar 3467 . 3468 <blockquote> 3469 <p>foo 3470 bar</p> 3471 </blockquote> 3472 ```````````````````````````````` 3473 3474 3475 To get a block quote with two paragraphs, use: 3476 3477 ```````````````````````````````` example 3478 > foo 3479 > 3480 > bar 3481 . 3482 <blockquote> 3483 <p>foo</p> 3484 <p>bar</p> 3485 </blockquote> 3486 ```````````````````````````````` 3487 3488 3489 Block quotes can interrupt paragraphs: 3490 3491 ```````````````````````````````` example 3492 foo 3493 > bar 3494 . 3495 <p>foo</p> 3496 <blockquote> 3497 <p>bar</p> 3498 </blockquote> 3499 ```````````````````````````````` 3500 3501 3502 In general, blank lines are not needed before or after block 3503 quotes: 3504 3505 ```````````````````````````````` example 3506 > aaa 3507 *** 3508 > bbb 3509 . 
3510 <blockquote> 3511 <p>aaa</p> 3512 </blockquote> 3513 <hr /> 3514 <blockquote> 3515 <p>bbb</p> 3516 </blockquote> 3517 ```````````````````````````````` 3518 3519 3520 However, because of laziness, a blank line is needed between 3521 a block quote and a following paragraph: 3522 3523 ```````````````````````````````` example 3524 > bar 3525 baz 3526 . 3527 <blockquote> 3528 <p>bar 3529 baz</p> 3530 </blockquote> 3531 ```````````````````````````````` 3532 3533 3534 ```````````````````````````````` example 3535 > bar 3536 3537 baz 3538 . 3539 <blockquote> 3540 <p>bar</p> 3541 </blockquote> 3542 <p>baz</p> 3543 ```````````````````````````````` 3544 3545 3546 ```````````````````````````````` example 3547 > bar 3548 > 3549 baz 3550 . 3551 <blockquote> 3552 <p>bar</p> 3553 </blockquote> 3554 <p>baz</p> 3555 ```````````````````````````````` 3556 3557 3558 It is a consequence of the Laziness rule that any number 3559 of initial `>`s may be omitted on a continuation line of a 3560 nested block quote: 3561 3562 ```````````````````````````````` example 3563 > > > foo 3564 bar 3565 . 3566 <blockquote> 3567 <blockquote> 3568 <blockquote> 3569 <p>foo 3570 bar</p> 3571 </blockquote> 3572 </blockquote> 3573 </blockquote> 3574 ```````````````````````````````` 3575 3576 3577 ```````````````````````````````` example 3578 >>> foo 3579 > bar 3580 >>baz 3581 . 3582 <blockquote> 3583 <blockquote> 3584 <blockquote> 3585 <p>foo 3586 bar 3587 baz</p> 3588 </blockquote> 3589 </blockquote> 3590 </blockquote> 3591 ```````````````````````````````` 3592 3593 3594 When including an indented code block in a block quote, 3595 remember that the [block quote marker] includes 3596 both the `>` and a following space. So *five spaces* are needed after 3597 the `>`: 3598 3599 ```````````````````````````````` example 3600 > code 3601 3602 > not code 3603 . 
3604 <blockquote> 3605 <pre><code>code 3606 </code></pre> 3607 </blockquote> 3608 <blockquote> 3609 <p>not code</p> 3610 </blockquote> 3611 ```````````````````````````````` 3612 3613 3614 3615 ## List items 3616 3617 A [list marker](@) is a 3618 [bullet list marker] or an [ordered list marker]. 3619 3620 A [bullet list marker](@) 3621 is a `-`, `+`, or `*` character. 3622 3623 An [ordered list marker](@) 3624 is a sequence of 1--9 arabic digits (`0-9`), followed by either a 3625 `.` character or a `)` character. (The reason for the length 3626 limit is that with 10 digits we start seeing integer overflows 3627 in some browsers.) 3628 3629 The following rules define [list items]: 3630 3631 1. **Basic case.** If a sequence of lines *Ls* constitute a sequence of 3632 blocks *Bs* starting with a [non-whitespace character] and not separated 3633 from each other by more than one blank line, and *M* is a list 3634 marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result 3635 of prepending *M* and the following spaces to the first line of 3636 *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a 3637 list item with *Bs* as its contents. The type of the list item 3638 (bullet or ordered) is determined by the type of its list marker. 3639 If the list item is ordered, then it is also assigned a start 3640 number, based on the ordered list marker. 3641 3642 Exceptions: When the first list item in a [list] interrupts 3643 a paragraph---that is, when it starts on a line that would 3644 otherwise count as [paragraph continuation text]---then (a) 3645 the lines *Ls* must not begin with a blank line, and (b) if 3646 the list item is ordered, the start number must be 1. 3647 3648 For example, let *Ls* be the lines 3649 3650 ```````````````````````````````` example 3651 A paragraph 3652 with two lines. 3653 3654 indented code 3655 3656 > A block quote. 3657 . 
3658 <p>A paragraph 3659 with two lines.</p> 3660 <pre><code>indented code 3661 </code></pre> 3662 <blockquote> 3663 <p>A block quote.</p> 3664 </blockquote> 3665 ```````````````````````````````` 3666 3667 3668 And let *M* be the marker `1.`, and *N* = 2. Then rule #1 says 3669 that the following is an ordered list item with start number 1, 3670 and the same contents as *Ls*: 3671 3672 ```````````````````````````````` example 3673 1. A paragraph 3674 with two lines. 3675 3676 indented code 3677 3678 > A block quote. 3679 . 3680 <ol> 3681 <li> 3682 <p>A paragraph 3683 with two lines.</p> 3684 <pre><code>indented code 3685 </code></pre> 3686 <blockquote> 3687 <p>A block quote.</p> 3688 </blockquote> 3689 </li> 3690 </ol> 3691 ```````````````````````````````` 3692 3693 3694 The most important thing to notice is that the position of 3695 the text after the list marker determines how much indentation 3696 is needed in subsequent blocks in the list item. If the list 3697 marker takes up two spaces, and there are three spaces between 3698 the list marker and the next [non-whitespace character], then blocks 3699 must be indented five spaces in order to fall under the list 3700 item. 3701 3702 Here are some examples showing how far content must be indented to be 3703 put under the list item: 3704 3705 ```````````````````````````````` example 3706 - one 3707 3708 two 3709 . 3710 <ul> 3711 <li>one</li> 3712 </ul> 3713 <p>two</p> 3714 ```````````````````````````````` 3715 3716 3717 ```````````````````````````````` example 3718 - one 3719 3720 two 3721 . 3722 <ul> 3723 <li> 3724 <p>one</p> 3725 <p>two</p> 3726 </li> 3727 </ul> 3728 ```````````````````````````````` 3729 3730 3731 ```````````````````````````````` example 3732 - one 3733 3734 two 3735 . 3736 <ul> 3737 <li>one</li> 3738 </ul> 3739 <pre><code> two 3740 </code></pre> 3741 ```````````````````````````````` 3742 3743 3744 ```````````````````````````````` example 3745 - one 3746 3747 two 3748 . 
3749 <ul> 3750 <li> 3751 <p>one</p> 3752 <p>two</p> 3753 </li> 3754 </ul> 3755 ```````````````````````````````` 3756 3757 3758 It is tempting to think of this in terms of columns: the continuation 3759 blocks must be indented at least to the column of the first 3760 [non-whitespace character] after the list marker. However, that is not quite right. 3761 The spaces after the list marker determine how much relative indentation 3762 is needed. Which column this indentation reaches will depend on 3763 how the list item is embedded in other constructions, as shown by 3764 this example: 3765 3766 ```````````````````````````````` example 3767 > > 1. one 3768 >> 3769 >> two 3770 . 3771 <blockquote> 3772 <blockquote> 3773 <ol> 3774 <li> 3775 <p>one</p> 3776 <p>two</p> 3777 </li> 3778 </ol> 3779 </blockquote> 3780 </blockquote> 3781 ```````````````````````````````` 3782 3783 3784 Here `two` occurs in the same column as the list marker `1.`, 3785 but is actually contained in the list item, because there is 3786 sufficient indentation after the last containing blockquote marker. 3787 3788 The converse is also possible. In the following example, the word `two` 3789 occurs far to the right of the initial text of the list item, `one`, but 3790 it is not considered part of the list item, because it is not indented 3791 far enough past the blockquote marker: 3792 3793 ```````````````````````````````` example 3794 >>- one 3795 >> 3796 > > two 3797 . 3798 <blockquote> 3799 <blockquote> 3800 <ul> 3801 <li>one</li> 3802 </ul> 3803 <p>two</p> 3804 </blockquote> 3805 </blockquote> 3806 ```````````````````````````````` 3807 3808 3809 Note that at least one space is needed between the list marker and 3810 any following content, so these are not list items: 3811 3812 ```````````````````````````````` example 3813 -one 3814 3815 2.two 3816 . 
3817 <p>-one</p> 3818 <p>2.two</p> 3819 ```````````````````````````````` 3820 3821 3822 A list item may contain blocks that are separated by more than 3823 one blank line. 3824 3825 ```````````````````````````````` example 3826 - foo 3827 3828 3829 bar 3830 . 3831 <ul> 3832 <li> 3833 <p>foo</p> 3834 <p>bar</p> 3835 </li> 3836 </ul> 3837 ```````````````````````````````` 3838 3839 3840 A list item may contain any kind of block: 3841 3842 ```````````````````````````````` example 3843 1. foo 3844 3845 ``` 3846 bar 3847 ``` 3848 3849 baz 3850 3851 > bam 3852 . 3853 <ol> 3854 <li> 3855 <p>foo</p> 3856 <pre><code>bar 3857 </code></pre> 3858 <p>baz</p> 3859 <blockquote> 3860 <p>bam</p> 3861 </blockquote> 3862 </li> 3863 </ol> 3864 ```````````````````````````````` 3865 3866 3867 A list item that contains an indented code block will preserve 3868 empty lines within the code block verbatim. 3869 3870 ```````````````````````````````` example 3871 - Foo 3872 3873 bar 3874 3875 3876 baz 3877 . 3878 <ul> 3879 <li> 3880 <p>Foo</p> 3881 <pre><code>bar 3882 3883 3884 baz 3885 </code></pre> 3886 </li> 3887 </ul> 3888 ```````````````````````````````` 3889 3890 Note that ordered list start numbers must be nine digits or fewer: 3891 3892 ```````````````````````````````` example 3893 123456789. ok 3894 . 3895 <ol start="123456789"> 3896 <li>ok</li> 3897 </ol> 3898 ```````````````````````````````` 3899 3900 3901 ```````````````````````````````` example 3902 1234567890. not ok 3903 . 3904 <p>1234567890. not ok</p> 3905 ```````````````````````````````` 3906 3907 3908 A start number may begin with 0s: 3909 3910 ```````````````````````````````` example 3911 0. ok 3912 . 3913 <ol start="0"> 3914 <li>ok</li> 3915 </ol> 3916 ```````````````````````````````` 3917 3918 3919 ```````````````````````````````` example 3920 003. ok 3921 . 
3922 <ol start="3"> 3923 <li>ok</li> 3924 </ol> 3925 ```````````````````````````````` 3926 3927 3928 A start number may not be negative: 3929 3930 ```````````````````````````````` example 3931 -1. not ok 3932 . 3933 <p>-1. not ok</p> 3934 ```````````````````````````````` 3935 3936 3937 3938 2. **Item starting with indented code.** If a sequence of lines *Ls* 3939 constitute a sequence of blocks *Bs* starting with an indented code 3940 block and not separated from each other by more than one blank line, 3941 and *M* is a list marker of width *W* followed by 3942 one space, then the result of prepending *M* and the following 3943 space to the first line of *Ls*, and indenting subsequent lines of 3944 *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents. 3945 If a line is empty, then it need not be indented. The type of the 3946 list item (bullet or ordered) is determined by the type of its list 3947 marker. If the list item is ordered, then it is also assigned a 3948 start number, based on the ordered list marker. 3949 3950 An indented code block will have to be indented four spaces beyond 3951 the edge of the region where text will be included in the list item. 3952 In the following case that is 6 spaces: 3953 3954 ```````````````````````````````` example 3955 - foo 3956 3957 bar 3958 . 3959 <ul> 3960 <li> 3961 <p>foo</p> 3962 <pre><code>bar 3963 </code></pre> 3964 </li> 3965 </ul> 3966 ```````````````````````````````` 3967 3968 3969 And in this case it is 11 spaces: 3970 3971 ```````````````````````````````` example 3972 10. foo 3973 3974 bar 3975 . 
3976 <ol start="10"> 3977 <li> 3978 <p>foo</p> 3979 <pre><code>bar 3980 </code></pre> 3981 </li> 3982 </ol> 3983 ```````````````````````````````` 3984 3985 3986 If the *first* block in the list item is an indented code block, 3987 then by rule #2, the contents must be indented *one* space after the 3988 list marker: 3989 3990 ```````````````````````````````` example 3991 indented code 3992 3993 paragraph 3994 3995 more code 3996 . 3997 <pre><code>indented code 3998 </code></pre> 3999 <p>paragraph</p> 4000 <pre><code>more code 4001 </code></pre> 4002 ```````````````````````````````` 4003 4004 4005 ```````````````````````````````` example 4006 1. indented code 4007 4008 paragraph 4009 4010 more code 4011 . 4012 <ol> 4013 <li> 4014 <pre><code>indented code 4015 </code></pre> 4016 <p>paragraph</p> 4017 <pre><code>more code 4018 </code></pre> 4019 </li> 4020 </ol> 4021 ```````````````````````````````` 4022 4023 4024 Note that an additional space indent is interpreted as space 4025 inside the code block: 4026 4027 ```````````````````````````````` example 4028 1. indented code 4029 4030 paragraph 4031 4032 more code 4033 . 4034 <ol> 4035 <li> 4036 <pre><code> indented code 4037 </code></pre> 4038 <p>paragraph</p> 4039 <pre><code>more code 4040 </code></pre> 4041 </li> 4042 </ol> 4043 ```````````````````````````````` 4044 4045 4046 Note that rules #1 and #2 only apply to two cases: (a) cases 4047 in which the lines to be included in a list item begin with a 4048 [non-whitespace character], and (b) cases in which 4049 they begin with an indented code 4050 block. In a case like the following, where the first block begins with 4051 a three-space indent, the rules do not allow us to form a list item by 4052 indenting the whole thing and prepending a list marker: 4053 4054 ```````````````````````````````` example 4055 foo 4056 4057 bar 4058 . 
4059 <p>foo</p> 4060 <p>bar</p> 4061 ```````````````````````````````` 4062 4063 4064 ```````````````````````````````` example 4065 - foo 4066 4067 bar 4068 . 4069 <ul> 4070 <li>foo</li> 4071 </ul> 4072 <p>bar</p> 4073 ```````````````````````````````` 4074 4075 4076 This is not a significant restriction, because when a block begins 4077 with 1-3 spaces indent, the indentation can always be removed without 4078 a change in interpretation, allowing rule #1 to be applied. So, in 4079 the above case: 4080 4081 ```````````````````````````````` example 4082 - foo 4083 4084 bar 4085 . 4086 <ul> 4087 <li> 4088 <p>foo</p> 4089 <p>bar</p> 4090 </li> 4091 </ul> 4092 ```````````````````````````````` 4093 4094 4095 3. **Item starting with a blank line.** If a sequence of lines *Ls* 4096 starting with a single [blank line] constitute a (possibly empty) 4097 sequence of blocks *Bs*, not separated from each other by more than 4098 one blank line, and *M* is a list marker of width *W*, 4099 then the result of prepending *M* to the first line of *Ls*, and 4100 indenting subsequent lines of *Ls* by *W + 1* spaces, is a list 4101 item with *Bs* as its contents. 4102 If a line is empty, then it need not be indented. The type of the 4103 list item (bullet or ordered) is determined by the type of its list 4104 marker. If the list item is ordered, then it is also assigned a 4105 start number, based on the ordered list marker. 4106 4107 Here are some list items that start with a blank line but are not empty: 4108 4109 ```````````````````````````````` example 4110 - 4111 foo 4112 - 4113 ``` 4114 bar 4115 ``` 4116 - 4117 baz 4118 . 
4119 <ul> 4120 <li>foo</li> 4121 <li> 4122 <pre><code>bar 4123 </code></pre> 4124 </li> 4125 <li> 4126 <pre><code>baz 4127 </code></pre> 4128 </li> 4129 </ul> 4130 ```````````````````````````````` 4131 4132 When the list item starts with a blank line, the number of spaces 4133 following the list marker doesn't change the required indentation: 4134 4135 ```````````````````````````````` example 4136 - 4137 foo 4138 . 4139 <ul> 4140 <li>foo</li> 4141 </ul> 4142 ```````````````````````````````` 4143 4144 4145 A list item can begin with at most one blank line. 4146 In the following example, `foo` is not part of the list 4147 item: 4148 4149 ```````````````````````````````` example 4150 - 4151 4152 foo 4153 . 4154 <ul> 4155 <li></li> 4156 </ul> 4157 <p>foo</p> 4158 ```````````````````````````````` 4159 4160 4161 Here is an empty bullet list item: 4162 4163 ```````````````````````````````` example 4164 - foo 4165 - 4166 - bar 4167 . 4168 <ul> 4169 <li>foo</li> 4170 <li></li> 4171 <li>bar</li> 4172 </ul> 4173 ```````````````````````````````` 4174 4175 4176 It does not matter whether there are spaces following the [list marker]: 4177 4178 ```````````````````````````````` example 4179 - foo 4180 - 4181 - bar 4182 . 4183 <ul> 4184 <li>foo</li> 4185 <li></li> 4186 <li>bar</li> 4187 </ul> 4188 ```````````````````````````````` 4189 4190 4191 Here is an empty ordered list item: 4192 4193 ```````````````````````````````` example 4194 1. foo 4195 2. 4196 3. bar 4197 . 4198 <ol> 4199 <li>foo</li> 4200 <li></li> 4201 <li>bar</li> 4202 </ol> 4203 ```````````````````````````````` 4204 4205 4206 A list may start or end with an empty list item: 4207 4208 ```````````````````````````````` example 4209 * 4210 . 4211 <ul> 4212 <li></li> 4213 </ul> 4214 ```````````````````````````````` 4215 4216 However, an empty list item cannot interrupt a paragraph: 4217 4218 ```````````````````````````````` example 4219 foo 4220 * 4221 4222 foo 4223 1. 4224 . 
4225 <p>foo 4226 *</p> 4227 <p>foo 4228 1.</p> 4229 ```````````````````````````````` 4230 4231 4232 4. **Indentation.** If a sequence of lines *Ls* constitutes a list item 4233 according to rule #1, #2, or #3, then the result of indenting each line 4234 of *Ls* by 1-3 spaces (the same for each line) also constitutes a 4235 list item with the same contents and attributes. If a line is 4236 empty, then it need not be indented. 4237 4238 Indented one space: 4239 4240 ```````````````````````````````` example 4241 1. A paragraph 4242 with two lines. 4243 4244 indented code 4245 4246 > A block quote. 4247 . 4248 <ol> 4249 <li> 4250 <p>A paragraph 4251 with two lines.</p> 4252 <pre><code>indented code 4253 </code></pre> 4254 <blockquote> 4255 <p>A block quote.</p> 4256 </blockquote> 4257 </li> 4258 </ol> 4259 ```````````````````````````````` 4260 4261 4262 Indented two spaces: 4263 4264 ```````````````````````````````` example 4265 1. A paragraph 4266 with two lines. 4267 4268 indented code 4269 4270 > A block quote. 4271 . 4272 <ol> 4273 <li> 4274 <p>A paragraph 4275 with two lines.</p> 4276 <pre><code>indented code 4277 </code></pre> 4278 <blockquote> 4279 <p>A block quote.</p> 4280 </blockquote> 4281 </li> 4282 </ol> 4283 ```````````````````````````````` 4284 4285 4286 Indented three spaces: 4287 4288 ```````````````````````````````` example 4289 1. A paragraph 4290 with two lines. 4291 4292 indented code 4293 4294 > A block quote. 4295 . 4296 <ol> 4297 <li> 4298 <p>A paragraph 4299 with two lines.</p> 4300 <pre><code>indented code 4301 </code></pre> 4302 <blockquote> 4303 <p>A block quote.</p> 4304 </blockquote> 4305 </li> 4306 </ol> 4307 ```````````````````````````````` 4308 4309 4310 Four spaces indent gives a code block: 4311 4312 ```````````````````````````````` example 4313 1. A paragraph 4314 with two lines. 4315 4316 indented code 4317 4318 > A block quote. 4319 . 4320 <pre><code>1. A paragraph 4321 with two lines. 
4322 4323 indented code 4324 4325 > A block quote. 4326 </code></pre> 4327 ```````````````````````````````` 4328 4329 4330 4331 5. **Laziness.** If a string of lines *Ls* constitute a [list 4332 item](#list-items) with contents *Bs*, then the result of deleting 4333 some or all of the indentation from one or more lines in which the 4334 next [non-whitespace character] after the indentation is 4335 [paragraph continuation text] is a 4336 list item with the same contents and attributes. The unindented 4337 lines are called 4338 [lazy continuation line](@)s. 4339 4340 Here is an example with [lazy continuation lines]: 4341 4342 ```````````````````````````````` example 4343 1. A paragraph 4344 with two lines. 4345 4346 indented code 4347 4348 > A block quote. 4349 . 4350 <ol> 4351 <li> 4352 <p>A paragraph 4353 with two lines.</p> 4354 <pre><code>indented code 4355 </code></pre> 4356 <blockquote> 4357 <p>A block quote.</p> 4358 </blockquote> 4359 </li> 4360 </ol> 4361 ```````````````````````````````` 4362 4363 4364 Indentation can be partially deleted: 4365 4366 ```````````````````````````````` example 4367 1. A paragraph 4368 with two lines. 4369 . 4370 <ol> 4371 <li>A paragraph 4372 with two lines.</li> 4373 </ol> 4374 ```````````````````````````````` 4375 4376 4377 These examples show how laziness can work in nested structures: 4378 4379 ```````````````````````````````` example 4380 > 1. > Blockquote 4381 continued here. 4382 . 4383 <blockquote> 4384 <ol> 4385 <li> 4386 <blockquote> 4387 <p>Blockquote 4388 continued here.</p> 4389 </blockquote> 4390 </li> 4391 </ol> 4392 </blockquote> 4393 ```````````````````````````````` 4394 4395 4396 ```````````````````````````````` example 4397 > 1. > Blockquote 4398 > continued here. 4399 . 4400 <blockquote> 4401 <ol> 4402 <li> 4403 <blockquote> 4404 <p>Blockquote 4405 continued here.</p> 4406 </blockquote> 4407 </li> 4408 </ol> 4409 </blockquote> 4410 ```````````````````````````````` 4411 4412 4413 4414 6. 
**That's all.** Nothing that is not counted as a list item by rules 4415 #1--5 counts as a [list item](#list-items). 4416 4417 The rules for sublists follow from the general rules above. A sublist 4418 must be indented the same number of spaces a paragraph would need to be 4419 in order to be included in the list item. 4420 4421 So, in this case we need two spaces indent: 4422 4423 ```````````````````````````````` example 4424 - foo 4425 - bar 4426 - baz 4427 - boo 4428 . 4429 <ul> 4430 <li>foo 4431 <ul> 4432 <li>bar 4433 <ul> 4434 <li>baz 4435 <ul> 4436 <li>boo</li> 4437 </ul> 4438 </li> 4439 </ul> 4440 </li> 4441 </ul> 4442 </li> 4443 </ul> 4444 ```````````````````````````````` 4445 4446 4447 One is not enough: 4448 4449 ```````````````````````````````` example 4450 - foo 4451 - bar 4452 - baz 4453 - boo 4454 . 4455 <ul> 4456 <li>foo</li> 4457 <li>bar</li> 4458 <li>baz</li> 4459 <li>boo</li> 4460 </ul> 4461 ```````````````````````````````` 4462 4463 4464 Here we need four, because the list marker is wider: 4465 4466 ```````````````````````````````` example 4467 10) foo 4468 - bar 4469 . 4470 <ol start="10"> 4471 <li>foo 4472 <ul> 4473 <li>bar</li> 4474 </ul> 4475 </li> 4476 </ol> 4477 ```````````````````````````````` 4478 4479 4480 Three is not enough: 4481 4482 ```````````````````````````````` example 4483 10) foo 4484 - bar 4485 . 4486 <ol start="10"> 4487 <li>foo</li> 4488 </ol> 4489 <ul> 4490 <li>bar</li> 4491 </ul> 4492 ```````````````````````````````` 4493 4494 4495 A list may be the first block in a list item: 4496 4497 ```````````````````````````````` example 4498 - - foo 4499 . 4500 <ul> 4501 <li> 4502 <ul> 4503 <li>foo</li> 4504 </ul> 4505 </li> 4506 </ul> 4507 ```````````````````````````````` 4508 4509 4510 ```````````````````````````````` example 4511 1. - 2. foo 4512 . 
4513 <ol> 4514 <li> 4515 <ul> 4516 <li> 4517 <ol start="2"> 4518 <li>foo</li> 4519 </ol> 4520 </li> 4521 </ul> 4522 </li> 4523 </ol> 4524 ```````````````````````````````` 4525 4526 4527 A list item can contain a heading: 4528 4529 ```````````````````````````````` example 4530 - # Foo 4531 - Bar 4532 --- 4533 baz 4534 . 4535 <ul> 4536 <li> 4537 <h1>Foo</h1> 4538 </li> 4539 <li> 4540 <h2>Bar</h2> 4541 baz</li> 4542 </ul> 4543 ```````````````````````````````` 4544 4545 4546 ### Motivation 4547 4548 John Gruber's Markdown spec says the following about list items: 4549 4550 1. "List markers typically start at the left margin, but may be indented 4551 by up to three spaces. List markers must be followed by one or more 4552 spaces or a tab." 4553 4554 2. "To make lists look nice, you can wrap items with hanging indents.... 4555 But if you don't want to, you don't have to." 4556 4557 3. "List items may consist of multiple paragraphs. Each subsequent 4558 paragraph in a list item must be indented by either 4 spaces or one 4559 tab." 4560 4561 4. "It looks nice if you indent every line of the subsequent paragraphs, 4562 but here again, Markdown will allow you to be lazy." 4563 4564 5. "To put a blockquote within a list item, the blockquote's `>` 4565 delimiters need to be indented." 4566 4567 6. "To put a code block within a list item, the code block needs to be 4568 indented twice — 8 spaces or two tabs." 4569 4570 These rules specify that a paragraph under a list item must be indented 4571 four spaces (presumably, from the left margin, rather than the start of 4572 the list marker, but this is not said), and that code under a list item 4573 must be indented eight spaces instead of the usual four. They also say 4574 that a block quote must be indented, but not by how much; however, the 4575 example given has four spaces indentation. 
Although nothing is said 4576 about other kinds of block-level content, it is certainly reasonable to 4577 infer that *all* block elements under a list item, including other 4578 lists, must be indented four spaces. This principle has been called the 4579 *four-space rule*. 4580 4581 The four-space rule is clear and principled, and if the reference 4582 implementation `Markdown.pl` had followed it, it probably would have 4583 become the standard. However, `Markdown.pl` allowed paragraphs and 4584 sublists to start with only two spaces indentation, at least on the 4585 outer level. Worse, its behavior was inconsistent: a sublist of an 4586 outer-level list needed two spaces indentation, but a sublist of this 4587 sublist needed three spaces. It is not surprising, then, that different 4588 implementations of Markdown have developed very different rules for 4589 determining what comes under a list item. (Pandoc and python-Markdown, 4590 for example, stuck with Gruber's syntax description and the four-space 4591 rule, while discount, redcarpet, marked, PHP Markdown, and others 4592 followed `Markdown.pl`'s behavior more closely.) 4593 4594 Unfortunately, given the divergences between implementations, there 4595 is no way to give a spec for list items that will be guaranteed not 4596 to break any existing documents. However, the spec given here should 4597 correctly handle lists formatted with either the four-space rule or 4598 the more forgiving `Markdown.pl` behavior, provided they are laid out 4599 in a way that is natural for a human to read. 4600 4601 The strategy here is to let the width and indentation of the list marker 4602 determine the indentation necessary for blocks to fall under the list 4603 item, rather than having a fixed and arbitrary number. The writer can 4604 think of the body of the list item as a unit which gets indented to the 4605 right enough to fit the list marker (and any indentation on the list 4606 marker). 
(The laziness rule, #5, then allows continuation lines to be 4607 unindented if needed.) 4608 4609 This rule is superior, we claim, to any rule requiring a fixed level of 4610 indentation from the margin. The four-space rule is clear but 4611 unnatural. It is quite unintuitive that 4612 4613 ``` markdown 4614 - foo 4615 4616 bar 4617 4618 - baz 4619 ``` 4620 4621 should be parsed as two lists with an intervening paragraph, 4622 4623 ``` html 4624 <ul> 4625 <li>foo</li> 4626 </ul> 4627 <p>bar</p> 4628 <ul> 4629 <li>baz</li> 4630 </ul> 4631 ``` 4632 4633 as the four-space rule demands, rather than a single list, 4634 4635 ``` html 4636 <ul> 4637 <li> 4638 <p>foo</p> 4639 <p>bar</p> 4640 <ul> 4641 <li>baz</li> 4642 </ul> 4643 </li> 4644 </ul> 4645 ``` 4646 4647 The choice of four spaces is arbitrary. It can be learned, but it is 4648 not likely to be guessed, and it trips up beginners regularly. 4649 4650 Would it help to adopt a two-space rule? The problem is that such 4651 a rule, together with the rule allowing 1--3 spaces indentation of the 4652 initial list marker, allows text that is indented *less than* the 4653 original list marker to be included in the list item. For example, 4654 `Markdown.pl` parses 4655 4656 ``` markdown 4657 - one 4658 4659 two 4660 ``` 4661 4662 as a single list item, with `two` a continuation paragraph: 4663 4664 ``` html 4665 <ul> 4666 <li> 4667 <p>one</p> 4668 <p>two</p> 4669 </li> 4670 </ul> 4671 ``` 4672 4673 and similarly 4674 4675 ``` markdown 4676 > - one 4677 > 4678 > two 4679 ``` 4680 4681 as 4682 4683 ``` html 4684 <blockquote> 4685 <ul> 4686 <li> 4687 <p>one</p> 4688 <p>two</p> 4689 </li> 4690 </ul> 4691 </blockquote> 4692 ``` 4693 4694 This is extremely unintuitive. 4695 4696 Rather than requiring a fixed indent from the margin, we could require 4697 a fixed indent (say, two spaces, or even one space) from the list marker (which 4698 may itself be indented). This proposal would remove the last anomaly 4699 discussed. 
Unlike the spec presented above, it would count the following 4700 as a list item with a subparagraph, even though the paragraph `bar` 4701 is not indented as far as the first paragraph `foo`: 4702 4703 ``` markdown 4704 10. foo 4705 4706 bar 4707 ``` 4708 4709 Arguably this text does read like a list item with `bar` as a subparagraph, 4710 which may count in favor of the proposal. However, on this proposal indented 4711 code would have to be indented six spaces after the list marker. And this 4712 would break a lot of existing Markdown, which has the pattern: 4713 4714 ``` markdown 4715 1. foo 4716 4717 indented code 4718 ``` 4719 4720 where the code is indented eight spaces. The spec above, by contrast, will 4721 parse this text as expected, since the code block's indentation is measured 4722 from the beginning of `foo`. 4723 4724 The one case that needs special treatment is a list item that *starts* 4725 with indented code. How much indentation is required in that case, since 4726 we don't have a "first paragraph" to measure from? Rule #2 simply stipulates 4727 that in such cases, we require one space indentation from the list marker 4728 (and then the normal four spaces for the indented code). This will match the 4729 four-space rule in cases where the list marker plus its initial indentation 4730 takes four spaces (a common case), but diverge in other cases. 4731 4732 ## Lists 4733 4734 A [list](@) is a sequence of one or more 4735 list items [of the same type]. The list items 4736 may be separated by any number of blank lines. 4737 4738 Two list items are [of the same type](@) 4739 if they begin with a [list marker] of the same type. 4740 Two list markers are of the 4741 same type if (a) they are bullet list markers using the same character 4742 (`-`, `+`, or `*`) or (b) they are ordered list numbers with the same 4743 delimiter (either `.` or `)`). 
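
The "same type" test for list markers can be sketched in a few lines of Python. This is only an illustration, not part of the spec: the names `marker_type` and `same_type` are hypothetical, and the sketch assumes the marker has already been isolated from its line (a bullet character, or 1 to 9 digits followed by `.` or `)`).

```python
import re

# Hypothetical helpers for illustration; the spec defines the rule, not this API.
_BULLET_RE = re.compile(r"^[-+*]$")
_ORDERED_RE = re.compile(r"^([0-9]{1,9})([.)])$")


def marker_type(marker):
    """Return a key that is equal for two markers exactly when
    the markers are of the same type."""
    if _BULLET_RE.match(marker):
        # Bullet markers are of the same type only if they use the same character.
        return ("bullet", marker)
    match = _ORDERED_RE.match(marker)
    if match:
        # Ordered markers are compared by delimiter; the number is ignored.
        return ("ordered", match.group(2))
    raise ValueError("not a list marker: %r" % marker)


def same_type(marker_a, marker_b):
    """True if two list items starting with these markers could share a list."""
    return marker_type(marker_a) == marker_type(marker_b)
```

Under these assumptions, `same_type("1.", "2.")` holds while `same_type("-", "+")` and `same_type("2.", "3)")` do not, which is why changing the bullet character or the ordered delimiter starts a new list.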
4744 4745 A list is an [ordered list](@) 4746 if its constituent list items begin with 4747 [ordered list markers], and a 4748 [bullet list](@) if its constituent list 4749 items begin with [bullet list markers]. 4750 4751 The [start number](@) 4752 of an [ordered list] is determined by the list number of 4753 its initial list item. The numbers of subsequent list items are 4754 disregarded. 4755 4756 A list is [loose](@) if any of its constituent 4757 list items are separated by blank lines, or if any of its constituent 4758 list items directly contain two block-level elements with a blank line 4759 between them. Otherwise a list is [tight](@). 4760 (The difference in HTML output is that paragraphs in a loose list are 4761 wrapped in `<p>` tags, while paragraphs in a tight list are not.) 4762 4763 Changing the bullet or ordered list delimiter starts a new list: 4764 4765 ```````````````````````````````` example 4766 - foo 4767 - bar 4768 + baz 4769 . 4770 <ul> 4771 <li>foo</li> 4772 <li>bar</li> 4773 </ul> 4774 <ul> 4775 <li>baz</li> 4776 </ul> 4777 ```````````````````````````````` 4778 4779 4780 ```````````````````````````````` example 4781 1. foo 4782 2. bar 4783 3) baz 4784 . 4785 <ol> 4786 <li>foo</li> 4787 <li>bar</li> 4788 </ol> 4789 <ol start="3"> 4790 <li>baz</li> 4791 </ol> 4792 ```````````````````````````````` 4793 4794 4795 In CommonMark, a list can interrupt a paragraph. That is, 4796 no blank line is needed to separate a paragraph from a following 4797 list: 4798 4799 ```````````````````````````````` example 4800 Foo 4801 - bar 4802 - baz 4803 . 4804 <p>Foo</p> 4805 <ul> 4806 <li>bar</li> 4807 <li>baz</li> 4808 </ul> 4809 ```````````````````````````````` 4810 4811 `Markdown.pl` does not allow this, through fear of triggering a list 4812 via a numeral in a hard-wrapped line: 4813 4814 ``` markdown 4815 The number of windows in my house is 4816 14. The number of doors is 6. 
4817 ``` 4818 4819 Oddly, though, `Markdown.pl` *does* allow a blockquote to 4820 interrupt a paragraph, even though the same considerations might 4821 apply. 4822 4823 In CommonMark, we do allow lists to interrupt paragraphs, for 4824 two reasons. First, it is natural and not uncommon for people 4825 to start lists without blank lines: 4826 4827 ``` markdown 4828 I need to buy 4829 - new shoes 4830 - a coat 4831 - a plane ticket 4832 ``` 4833 4834 Second, we are attracted to a 4835 4836 > [principle of uniformity](@): 4837 > if a chunk of text has a certain 4838 > meaning, it will continue to have the same meaning when put into a 4839 > container block (such as a list item or blockquote). 4840 4841 (Indeed, the spec for [list items] and [block quotes] presupposes 4842 this principle.) This principle implies that if 4843 4844 ``` markdown 4845 * I need to buy 4846 - new shoes 4847 - a coat 4848 - a plane ticket 4849 ``` 4850 4851 is a list item containing a paragraph followed by a nested sublist, 4852 as all Markdown implementations agree it is (though the paragraph 4853 may be rendered without `<p>` tags, since the list is "tight"), 4854 then 4855 4856 ``` markdown 4857 I need to buy 4858 - new shoes 4859 - a coat 4860 - a plane ticket 4861 ``` 4862 4863 by itself should be a paragraph followed by a nested sublist. 4864 4865 Since it is well-established Markdown practice to allow lists to 4866 interrupt paragraphs inside list items, the [principle of 4867 uniformity] requires us to allow this outside list items as 4868 well. ([reStructuredText](http://docutils.sourceforge.net/rst.html) 4869 takes a different approach, requiring blank lines before lists 4870 even inside other list items.) 4871 4872 In order to solve the problem of unwanted lists in paragraphs with 4873 hard-wrapped numerals, we allow an ordered list to interrupt a 4874 paragraph only if it starts with `1`. Thus, 4875 4876 ```````````````````````````````` example 4877 The number of windows in my house is 4878 14.
The number of doors is 6. 4879 . 4880 <p>The number of windows in my house is 4881 14. The number of doors is 6.</p> 4882 ```````````````````````````````` 4883 4884 We may still get an unintended result in cases like 4885 4886 ```````````````````````````````` example 4887 The number of windows in my house is 4888 1. The number of doors is 6. 4889 . 4890 <p>The number of windows in my house is</p> 4891 <ol> 4892 <li>The number of doors is 6.</li> 4893 </ol> 4894 ```````````````````````````````` 4895 4896 but this rule should prevent most spurious list captures. 4897 4898 There can be any number of blank lines between items: 4899 4900 ```````````````````````````````` example 4901 - foo 4902 4903 - bar 4904 4905 4906 - baz 4907 . 4908 <ul> 4909 <li> 4910 <p>foo</p> 4911 </li> 4912 <li> 4913 <p>bar</p> 4914 </li> 4915 <li> 4916 <p>baz</p> 4917 </li> 4918 </ul> 4919 ```````````````````````````````` 4920 4921 ```````````````````````````````` example 4922 - foo 4923 - bar 4924 - baz 4925 4926 4927 bim 4928 . 4929 <ul> 4930 <li>foo 4931 <ul> 4932 <li>bar 4933 <ul> 4934 <li> 4935 <p>baz</p> 4936 <p>bim</p> 4937 </li> 4938 </ul> 4939 </li> 4940 </ul> 4941 </li> 4942 </ul> 4943 ```````````````````````````````` 4944 4945 4946 To separate consecutive lists of the same type, or to separate a 4947 list from an indented code block that would otherwise be parsed 4948 as a subparagraph of the final list item, you can insert a blank HTML 4949 comment: 4950 4951 ```````````````````````````````` example 4952 - foo 4953 - bar 4954 4955 <!-- --> 4956 4957 - baz 4958 - bim 4959 . 4960 <ul> 4961 <li>foo</li> 4962 <li>bar</li> 4963 </ul> 4964 <!-- --> 4965 <ul> 4966 <li>baz</li> 4967 <li>bim</li> 4968 </ul> 4969 ```````````````````````````````` 4970 4971 4972 ```````````````````````````````` example 4973 - foo 4974 4975 notcode 4976 4977 - foo 4978 4979 <!-- --> 4980 4981 code 4982 . 
4983 <ul> 4984 <li> 4985 <p>foo</p> 4986 <p>notcode</p> 4987 </li> 4988 <li> 4989 <p>foo</p> 4990 </li> 4991 </ul> 4992 <!-- --> 4993 <pre><code>code 4994 </code></pre> 4995 ```````````````````````````````` 4996 4997 4998 List items need not be indented to the same level. The following 4999 list items will be treated as items at the same list level, 5000 since none is indented enough to belong to the previous list 5001 item: 5002 5003 ```````````````````````````````` example 5004 - a 5005 - b 5006 - c 5007 - d 5008 - e 5009 - f 5010 - g 5011 - h 5012 - i 5013 . 5014 <ul> 5015 <li>a</li> 5016 <li>b</li> 5017 <li>c</li> 5018 <li>d</li> 5019 <li>e</li> 5020 <li>f</li> 5021 <li>g</li> 5022 <li>h</li> 5023 <li>i</li> 5024 </ul> 5025 ```````````````````````````````` 5026 5027 5028 ```````````````````````````````` example 5029 1. a 5030 5031 2. b 5032 5033 3. c 5034 . 5035 <ol> 5036 <li> 5037 <p>a</p> 5038 </li> 5039 <li> 5040 <p>b</p> 5041 </li> 5042 <li> 5043 <p>c</p> 5044 </li> 5045 </ol> 5046 ```````````````````````````````` 5047 5048 5049 This is a loose list, because there is a blank line between 5050 two of the list items: 5051 5052 ```````````````````````````````` example 5053 - a 5054 - b 5055 5056 - c 5057 . 5058 <ul> 5059 <li> 5060 <p>a</p> 5061 </li> 5062 <li> 5063 <p>b</p> 5064 </li> 5065 <li> 5066 <p>c</p> 5067 </li> 5068 </ul> 5069 ```````````````````````````````` 5070 5071 5072 So is this, with an empty second item: 5073 5074 ```````````````````````````````` example 5075 * a 5076 * 5077 5078 * c 5079 . 5080 <ul> 5081 <li> 5082 <p>a</p> 5083 </li> 5084 <li></li> 5085 <li> 5086 <p>c</p> 5087 </li> 5088 </ul> 5089 ```````````````````````````````` 5090 5091 5092 These are loose lists, even though there is no blank line between the items, 5093 because one of the items directly contains two block-level elements 5094 with a blank line between them: 5095 5096 ```````````````````````````````` example 5097 - a 5098 - b 5099 5100 c 5101 - d 5102 .
5103 <ul> 5104 <li> 5105 <p>a</p> 5106 </li> 5107 <li> 5108 <p>b</p> 5109 <p>c</p> 5110 </li> 5111 <li> 5112 <p>d</p> 5113 </li> 5114 </ul> 5115 ```````````````````````````````` 5116 5117 5118 ```````````````````````````````` example 5119 - a 5120 - b 5121 5122 [ref]: /url 5123 - d 5124 . 5125 <ul> 5126 <li> 5127 <p>a</p> 5128 </li> 5129 <li> 5130 <p>b</p> 5131 </li> 5132 <li> 5133 <p>d</p> 5134 </li> 5135 </ul> 5136 ```````````````````````````````` 5137 5138 5139 This is a tight list, because the blank lines are in a code block: 5140 5141 ```````````````````````````````` example 5142 - a 5143 - ``` 5144 b 5145 5146 5147 ``` 5148 - c 5149 . 5150 <ul> 5151 <li>a</li> 5152 <li> 5153 <pre><code>b 5154 5155 5156 </code></pre> 5157 </li> 5158 <li>c</li> 5159 </ul> 5160 ```````````````````````````````` 5161 5162 5163 This is a tight list, because the blank line is between two 5164 paragraphs of a sublist. So the sublist is loose while 5165 the outer list is tight: 5166 5167 ```````````````````````````````` example 5168 - a 5169 - b 5170 5171 c 5172 - d 5173 . 5174 <ul> 5175 <li>a 5176 <ul> 5177 <li> 5178 <p>b</p> 5179 <p>c</p> 5180 </li> 5181 </ul> 5182 </li> 5183 <li>d</li> 5184 </ul> 5185 ```````````````````````````````` 5186 5187 5188 This is a tight list, because the blank line is inside the 5189 block quote: 5190 5191 ```````````````````````````````` example 5192 * a 5193 > b 5194 > 5195 * c 5196 . 5197 <ul> 5198 <li>a 5199 <blockquote> 5200 <p>b</p> 5201 </blockquote> 5202 </li> 5203 <li>c</li> 5204 </ul> 5205 ```````````````````````````````` 5206 5207 5208 This list is tight, because the consecutive block elements 5209 are not separated by blank lines: 5210 5211 ```````````````````````````````` example 5212 - a 5213 > b 5214 ``` 5215 c 5216 ``` 5217 - d 5218 . 
5219 <ul> 5220 <li>a 5221 <blockquote> 5222 <p>b</p> 5223 </blockquote> 5224 <pre><code>c 5225 </code></pre> 5226 </li> 5227 <li>d</li> 5228 </ul> 5229 ```````````````````````````````` 5230 5231 5232 A single-paragraph list is tight: 5233 5234 ```````````````````````````````` example 5235 - a 5236 . 5237 <ul> 5238 <li>a</li> 5239 </ul> 5240 ```````````````````````````````` 5241 5242 5243 ```````````````````````````````` example 5244 - a 5245 - b 5246 . 5247 <ul> 5248 <li>a 5249 <ul> 5250 <li>b</li> 5251 </ul> 5252 </li> 5253 </ul> 5254 ```````````````````````````````` 5255 5256 5257 This list is loose, because of the blank line between the 5258 two block elements in the list item: 5259 5260 ```````````````````````````````` example 5261 1. ``` 5262 foo 5263 ``` 5264 5265 bar 5266 . 5267 <ol> 5268 <li> 5269 <pre><code>foo 5270 </code></pre> 5271 <p>bar</p> 5272 </li> 5273 </ol> 5274 ```````````````````````````````` 5275 5276 5277 Here the outer list is loose, the inner list tight: 5278 5279 ```````````````````````````````` example 5280 * foo 5281 * bar 5282 5283 baz 5284 . 5285 <ul> 5286 <li> 5287 <p>foo</p> 5288 <ul> 5289 <li>bar</li> 5290 </ul> 5291 <p>baz</p> 5292 </li> 5293 </ul> 5294 ```````````````````````````````` 5295 5296 5297 ```````````````````````````````` example 5298 - a 5299 - b 5300 - c 5301 5302 - d 5303 - e 5304 - f 5305 . 5306 <ul> 5307 <li> 5308 <p>a</p> 5309 <ul> 5310 <li>b</li> 5311 <li>c</li> 5312 </ul> 5313 </li> 5314 <li> 5315 <p>d</p> 5316 <ul> 5317 <li>e</li> 5318 <li>f</li> 5319 </ul> 5320 </li> 5321 </ul> 5322 ```````````````````````````````` 5323 5324 5325 # Inlines 5326 5327 Inlines are parsed sequentially from the beginning of the character 5328 stream to the end (left to right, in left-to-right languages). 5329 Thus, for example, in 5330 5331 ```````````````````````````````` example 5332 `hi`lo` 5333 . 
5334 <p><code>hi</code>lo`</p> 5335 ```````````````````````````````` 5336 5337 5338 `hi` is parsed as code, leaving the backtick at the end as a literal 5339 backtick. 5340 5341 ## Backslash escapes 5342 5343 Any ASCII punctuation character may be backslash-escaped: 5344 5345 ```````````````````````````````` example 5346 \!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~ 5347 . 5348 <p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p> 5349 ```````````````````````````````` 5350 5351 5352 Backslashes before other characters are treated as literal 5353 backslashes: 5354 5355 ```````````````````````````````` example 5356 \→\A\a\ \3\φ\« 5357 . 5358 <p>\→\A\a\ \3\φ\«</p> 5359 ```````````````````````````````` 5360 5361 5362 Escaped characters are treated as regular characters and do 5363 not have their usual Markdown meanings: 5364 5365 ```````````````````````````````` example 5366 \*not emphasized* 5367 \<br/> not a tag 5368 \[not a link](/foo) 5369 \`not code` 5370 1\. not a list 5371 \* not a list 5372 \# not a heading 5373 \[foo]: /url "not a reference" 5374 . 5375 <p>*not emphasized* 5376 &lt;br/&gt; not a tag 5377 [not a link](/foo) 5378 `not code` 5379 1. not a list 5380 * not a list 5381 # not a heading 5382 [foo]: /url &quot;not a reference&quot;</p> 5383 ```````````````````````````````` 5384 5385 5386 If a backslash is itself escaped, the following character is not: 5387 5388 ```````````````````````````````` example 5389 \\*emphasis* 5390 . 5391 <p>\<em>emphasis</em></p> 5392 ```````````````````````````````` 5393 5394 5395 A backslash at the end of the line is a [hard line break]: 5396 5397 ```````````````````````````````` example 5398 foo\ 5399 bar 5400 . 5401 <p>foo<br /> 5402 bar</p> 5403 ```````````````````````````````` 5404 5405 5406 Backslash escapes do not work in code blocks, code spans, autolinks, or 5407 raw HTML: 5408 5409 ```````````````````````````````` example 5410 `` \[\` `` 5411 .
5412 <p><code>\[\`</code></p> 5413 ```````````````````````````````` 5414 5415 5416 ```````````````````````````````` example 5417 \[\] 5418 . 5419 <pre><code>\[\] 5420 </code></pre> 5421 ```````````````````````````````` 5422 5423 5424 ```````````````````````````````` example 5425 ~~~ 5426 \[\] 5427 ~~~ 5428 . 5429 <pre><code>\[\] 5430 </code></pre> 5431 ```````````````````````````````` 5432 5433 5434 ```````````````````````````````` example 5435 <http://example.com?find=\*> 5436 . 5437 <p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p> 5438 ```````````````````````````````` 5439 5440 5441 ```````````````````````````````` example 5442 <a href="/bar\/)"> 5443 . 5444 <a href="/bar\/)"> 5445 ```````````````````````````````` 5446 5447 5448 But they work in all other contexts, including URLs and link titles, 5449 link references, and [info strings] in [fenced code blocks]: 5450 5451 ```````````````````````````````` example 5452 [foo](/bar\* "ti\*tle") 5453 . 5454 <p><a href="/bar*" title="ti*tle">foo</a></p> 5455 ```````````````````````````````` 5456 5457 5458 ```````````````````````````````` example 5459 [foo] 5460 5461 [foo]: /bar\* "ti\*tle" 5462 . 5463 <p><a href="/bar*" title="ti*tle">foo</a></p> 5464 ```````````````````````````````` 5465 5466 5467 ```````````````````````````````` example 5468 ``` foo\+bar 5469 foo 5470 ``` 5471 . 5472 <pre><code class="language-foo+bar">foo 5473 </code></pre> 5474 ```````````````````````````````` 5475 5476 5477 5478 ## Entity and numeric character references 5479 5480 All valid HTML entity references and numeric character 5481 references, except those occurring in code blocks and code spans, 5482 are recognized as such and treated as equivalent to the 5483 corresponding Unicode characters. Conforming CommonMark parsers 5484 need not store information about whether a particular character 5485 was represented in the source using a Unicode character or 5486 an entity reference.
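
The equivalence between a named entity reference and its Unicode character can be illustrated with a short Python sketch. It is not the reference implementation: the function name `decode_named_entities` is hypothetical, the sketch handles only named references (not numeric ones), it makes no attempt to skip code spans or code blocks, and it uses the standard library table `html.entities.html5` as a stand-in for the authoritative entity list. It assumes, as this spec does, that a reference must end in a semicolon.

```python
import re
from html.entities import html5

# A candidate reference: "&", a name made of ASCII letters and digits, then ";".
# The captured group keeps the ";" because the keys of `html5` include it.
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*;)")


def decode_named_entities(text):
    """Replace each known named entity reference with its character(s).

    Unknown names, and names without a trailing semicolon, are left as-is.
    """
    def replace(match):
        # Fall back to the original text when the name is not in the table.
        return html5.get(match.group(1), match.group(0))

    return _ENTITY_RE.sub(replace, text)
```

With this sketch, `decode_named_entities("f&ouml;&ouml;")` yields the same string as typing the two `ö` characters directly, which is why a conforming parser need not remember which form the source used.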
5487 5488 [Entity references](@) consist of `&` + any of the valid 5489 HTML5 entity names + `;`. The 5490 document <https://html.spec.whatwg.org/multipage/entities.json> 5491 is used as an authoritative source for the valid entity 5492 references and their corresponding code points. 5493 5494 ```````````````````````````````` example 5495 &nbsp; &amp; &copy; &AElig; &Dcaron; 5496 &frac34; &HilbertSpace; &DifferentialD; 5497 &ClockwiseContourIntegral; &ngE; 5498 . 5499 <p>  &amp; © Æ Ď 5500 ¾ ℋ ⅆ 5501 ∲ ≧̸</p> 5502 ```````````````````````````````` 5503 5504 5505 [Decimal numeric character 5506 references](@) 5507 consist of `&#` + a string of 1--8 arabic digits + `;`. A 5508 numeric character reference is parsed as the corresponding 5509 Unicode character. Invalid Unicode code points will be replaced by 5510 the REPLACEMENT CHARACTER (`U+FFFD`). For security reasons, 5511 the code point `U+0000` will also be replaced by `U+FFFD`. 5512 5513 ```````````````````````````````` example 5514 &#35; &#1234; &#992; &#98765432; &#0; 5515 . 5516 <p># Ӓ Ϡ � �</p> 5517 ```````````````````````````````` 5518 5519 5520 [Hexadecimal numeric character 5521 references](@) consist of `&#` + 5522 either `X` or `x` + a string of 1--8 hexadecimal digits + `;`. 5523 They too are parsed as the corresponding Unicode character (this 5524 time specified with a hexadecimal numeral instead of decimal). 5525 5526 ```````````````````````````````` example 5527 &#X22; &#XD06; &#xcab; 5528 . 5529 <p>&quot; ആ ಫ</p> 5530 ```````````````````````````````` 5531 5532 5533 Here are some nonentities: 5534 5535 ```````````````````````````````` example 5536 &nbsp &x; &#; &#x; 5537 &ThisIsNotDefined; &hi?; 5538 . 5539 <p>&amp;nbsp &amp;x; &amp;#; &amp;#x; 5540 &amp;ThisIsNotDefined; &amp;hi?;</p> 5541 ```````````````````````````````` 5542 5543 5544 Although HTML5 does accept some entity references 5545 without a trailing semicolon (such as `&copy`), these are not 5546 recognized here, because it makes the grammar too ambiguous: 5547 5548 ```````````````````````````````` example 5549 &copy 5550 .
5551 <p>&amp;copy</p> 5552 ```````````````````````````````` 5553 5554 5555 Strings that are not on the list of HTML5 named entities are not 5556 recognized as entity references either: 5557 5558 ```````````````````````````````` example 5559 &MadeUpEntity; 5560 . 5561 <p>&amp;MadeUpEntity;</p> 5562 ```````````````````````````````` 5563 5564 5565 Entity and numeric character references are recognized in any 5566 context besides code spans or code blocks, including 5567 URLs, [link titles], and [fenced code block][] [info strings]: 5568 5569 ```````````````````````````````` example 5570 <a href="&ouml;&ouml;.html"> 5571 . 5572 <a href="&ouml;&ouml;.html"> 5573 ```````````````````````````````` 5574 5575 5576 ```````````````````````````````` example 5577 [foo](/f&ouml;&ouml; "f&ouml;&ouml;") 5578 . 5579 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 5580 ```````````````````````````````` 5581 5582 5583 ```````````````````````````````` example 5584 [foo] 5585 5586 [foo]: /f&ouml;&ouml; "f&ouml;&ouml;" 5587 . 5588 <p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p> 5589 ```````````````````````````````` 5590 5591 5592 ```````````````````````````````` example 5593 ``` f&ouml;&ouml; 5594 foo 5595 ``` 5596 . 5597 <pre><code class="language-föö">foo 5598 </code></pre> 5599 ```````````````````````````````` 5600 5601 5602 Entity and numeric character references are treated as literal 5603 text in code spans and code blocks: 5604 5605 ```````````````````````````````` example 5606 `f&ouml;&ouml;` 5607 . 5608 <p><code>f&amp;ouml;&amp;ouml;</code></p> 5609 ```````````````````````````````` 5610 5611 5612 ```````````````````````````````` example 5613 f&ouml;f&ouml; 5614 . 5615 <pre><code>f&amp;ouml;f&amp;ouml; 5616 </code></pre> 5617 ```````````````````````````````` 5618 5619 5620 ## Code spans 5621 5622 A [backtick string](@) 5623 is a string of one or more backtick characters (`` ` ``) that is neither 5624 preceded nor followed by a backtick. 5625 5626 A [code span](@) begins with a backtick string and ends with 5627 a backtick string of equal length.
The contents of the code span are
the characters between the two backtick strings, with leading and
trailing spaces and [line endings] removed, and
[whitespace] collapsed to single spaces.

This is a simple code span:

```````````````````````````````` example
`foo`
.
<p><code>foo</code></p>
````````````````````````````````


Here two backticks are used, because the code contains a backtick.
This example also illustrates stripping of leading and trailing spaces:

```````````````````````````````` example
`` foo ` bar  ``
.
<p><code>foo ` bar</code></p>
````````````````````````````````


This example shows the motivation for stripping leading and trailing
spaces:

```````````````````````````````` example
` `` `
.
<p><code>``</code></p>
````````````````````````````````


[Line endings] are treated like spaces:

```````````````````````````````` example
``
foo
``
.
<p><code>foo</code></p>
````````````````````````````````


Interior spaces and [line endings] are collapsed into
single spaces, just as they would be by a browser:

```````````````````````````````` example
`foo   bar
  baz`
.
<p><code>foo bar baz</code></p>
````````````````````````````````


Not all [Unicode whitespace] (for instance, non-breaking space) is
collapsed, however:

```````````````````````````````` example
`a  b`
.
<p><code>a  b</code></p>
````````````````````````````````


Q: Why not just leave the spaces, since browsers will collapse them
anyway?  A:  Because we might be targeting a non-HTML format, and we
shouldn't rely on HTML-specific rendering assumptions.

(Existing implementations differ in their treatment of internal
spaces and [line endings].
Some, including `Markdown.pl` and
`showdown`, convert an internal [line ending] into a
`<br />` tag.  But this makes things difficult for those who like to
hard-wrap their paragraphs, since a line break in the midst of a code
span will cause an unintended line break in the output.  Others just
leave internal spaces as they are, which is fine if only HTML is being
targeted.)

```````````````````````````````` example
`foo `` bar`
.
<p><code>foo `` bar</code></p>
````````````````````````````````


Note that backslash escapes do not work in code spans. All backslashes
are treated literally:

```````````````````````````````` example
`foo\`bar`
.
<p><code>foo\</code>bar`</p>
````````````````````````````````


Backslash escapes are never needed, because one can always choose a
string of *n* backtick characters as delimiters, where the code does
not contain any strings of exactly *n* backtick characters.

Code span backticks have higher precedence than any other inline
constructs except HTML tags and autolinks.  Thus, for example, this is
not parsed as emphasized text, since the second `*` is part of a code
span:

```````````````````````````````` example
*foo`*`
.
<p>*foo<code>*</code></p>
````````````````````````````````


And this is not parsed as a link:

```````````````````````````````` example
[not a `link](/foo`)
.
<p>[not a <code>link](/foo</code>)</p>
````````````````````````````````


Code spans, HTML tags, and autolinks have the same precedence.
Thus, this is code:

```````````````````````````````` example
`<a href="`">`
.
<p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
````````````````````````````````


But this is an HTML tag:

```````````````````````````````` example
<a href="`">`
.
<p><a href="`">`</p>
````````````````````````````````


And this is code:

```````````````````````````````` example
`<http://foo.bar.`baz>`
.
<p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
````````````````````````````````


But this is an autolink:

```````````````````````````````` example
<http://foo.bar.`baz>`
.
<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
````````````````````````````````


When a backtick string is not closed by a matching backtick string,
we just have literal backticks:

```````````````````````````````` example
```foo``
.
<p>```foo``</p>
````````````````````````````````


```````````````````````````````` example
`foo
.
<p>`foo</p>
````````````````````````````````

The following case also illustrates the need for opening and
closing backtick strings to be equal in length:

```````````````````````````````` example
`foo``bar``
.
<p>`foo<code>bar</code></p>
````````````````````````````````


## Emphasis and strong emphasis

John Gruber's original [Markdown syntax
description](http://daringfireball.net/projects/markdown/syntax#em) says:

> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
> emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
> `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
> tag.

This is enough for most users, but these rules leave much undecided,
especially when it comes to nested emphasis.
The original
`Markdown.pl` test suite makes it clear that triple `***` and
`___` delimiters can be used for strong emphasis, and most
implementations have also allowed the following patterns:

``` markdown
***strong emph***
***strong** in emph*
***emph* in strong**
**in strong *emph***
*in emph **strong***
```

The following patterns are less widely supported, but the intent
is clear and they are useful (especially in contexts like bibliography
entries):

``` markdown
*emph *with emph* in it*
**strong **with strong** in it**
```

Many implementations have also restricted intraword emphasis to
the `*` forms, to avoid unwanted emphasis in words containing
internal underscores.  (It is best practice to put these in code
spans, but users often do not.)

``` markdown
internal emphasis: foo*bar*baz
no emphasis: foo_bar_baz
```

The rules given below capture all of these patterns, while allowing
for efficient parsing strategies that do not backtrack.

First, some definitions.  A [delimiter run](@) is either
a sequence of one or more `*` characters that is not preceded or
followed by a `*` character, or a sequence of one or more `_`
characters that is not preceded or followed by a `_` character.

A [left-flanking delimiter run](@) is
a [delimiter run] that is (a) not followed by [Unicode whitespace],
and (b) not followed by a [punctuation character], or
preceded by [Unicode whitespace] or a [punctuation character].
For purposes of this definition, the beginning and the end of
the line count as Unicode whitespace.

A [right-flanking delimiter run](@) is
a [delimiter run] that is (a) not preceded by [Unicode whitespace],
and (b) not preceded by a [punctuation character], or
followed by [Unicode whitespace] or a [punctuation character].
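
The two flanking definitions can be sketched in Python. This is an
illustrative sketch only, not part of this specification: the names
`flanking`, `_is_whitespace`, and `_is_punctuation` are hypothetical, the
string passed in is assumed to be a single line, and the whitespace and
punctuation tests are assumptions based on the definitions of
[Unicode whitespace] and [punctuation character] given elsewhere in this
document.

``` python
import string
import unicodedata


def _is_whitespace(ch):
    # None stands for the beginning or end of the line, which is
    # treated as Unicode whitespace for flanking purposes.
    if ch is None:
        return True
    return ch in "\t\n\f\r" or unicodedata.category(ch) == "Zs"


def _is_punctuation(ch):
    # Assumption: ASCII punctuation or any Unicode "P*" general category.
    if ch is None:
        return False
    return ch in string.punctuation or unicodedata.category(ch).startswith("P")


def flanking(text, start, end):
    """Classify the delimiter run text[start:end].

    Returns a pair (left_flanking, right_flanking).
    """
    before = text[start - 1] if start > 0 else None
    after = text[end] if end < len(text) else None
    left = not _is_whitespace(after) and (
        not _is_punctuation(after)
        or _is_whitespace(before)
        or _is_punctuation(before)
    )
    right = not _is_whitespace(before) and (
        not _is_punctuation(before)
        or _is_whitespace(after)
        or _is_punctuation(after)
    )
    return left, right
```

For instance, `flanking("abc***def", 3, 6)` classifies the run `***`
between `abc` and `def`, and `flanking("abc *** def", 4, 7)` classifies a
run surrounded by spaces.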
For purposes of this definition, the beginning and the end of
the line count as Unicode whitespace.

Here are some examples of delimiter runs.

  - left-flanking but not right-flanking:

    ```
    ***abc
      _abc
    **"abc"
     _"abc"
    ```

  - right-flanking but not left-flanking:

    ```
     abc***
     abc_
    "abc"**
    "abc"_
    ```

  - Both left and right-flanking:

    ```
     abc***def
    "abc"_"def"
    ```

  - Neither left nor right-flanking:

    ```
    abc *** def
    a _ b
    ```

(The idea of distinguishing left-flanking and right-flanking
delimiter runs based on the character before and the character
after comes from Roopesh Chander's
[vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
vfmd uses the terminology "emphasis indicator string" instead of "delimiter
run," and its rules for distinguishing left- and right-flanking runs
are a bit more complex than the ones given here.)

The following rules define emphasis and strong emphasis:

1.  A single `*` character [can open emphasis](@)
    iff (if and only if) it is part of a [left-flanking delimiter run].

2.  A single `_` character [can open emphasis] iff
    it is part of a [left-flanking delimiter run]
    and either (a) not part of a [right-flanking delimiter run]
    or (b) part of a [right-flanking delimiter run]
    preceded by punctuation.

3.  A single `*` character [can close emphasis](@)
    iff it is part of a [right-flanking delimiter run].

4.  A single `_` character [can close emphasis] iff
    it is part of a [right-flanking delimiter run]
    and either (a) not part of a [left-flanking delimiter run]
    or (b) part of a [left-flanking delimiter run]
    followed by punctuation.

5.  A double `**` [can open strong emphasis](@)
    iff it is part of a [left-flanking delimiter run].

6.  A double `__` [can open strong emphasis] iff
    it is part of a [left-flanking delimiter run]
    and either (a) not part of a [right-flanking delimiter run]
    or (b) part of a [right-flanking delimiter run]
    preceded by punctuation.

7.  A double `**` [can close strong emphasis](@)
    iff it is part of a [right-flanking delimiter run].

8.  A double `__` [can close strong emphasis] iff
    it is part of a [right-flanking delimiter run]
    and either (a) not part of a [left-flanking delimiter run]
    or (b) part of a [left-flanking delimiter run]
    followed by punctuation.

9.  Emphasis begins with a delimiter that [can open emphasis] and ends
    with a delimiter that [can close emphasis], and that uses the same
    character (`_` or `*`) as the opening delimiter.  The
    opening and closing delimiters must belong to separate
    [delimiter runs].  If one of the delimiters can both
    open and close emphasis, then the sum of the lengths of the
    delimiter runs containing the opening and closing delimiters
    must not be a multiple of 3.

10. Strong emphasis begins with a delimiter that
    [can open strong emphasis] and ends with a delimiter that
    [can close strong emphasis], and that uses the same character
    (`_` or `*`) as the opening delimiter.  The
    opening and closing delimiters must belong to separate
    [delimiter runs].  If one of the delimiters can both open
    and close strong emphasis, then the sum of the lengths of
    the delimiter runs containing the opening and closing
    delimiters must not be a multiple of 3.

11. A literal `*` character cannot occur at the beginning or end of
    `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
    is backslash-escaped.

12. A literal `_` character cannot occur at the beginning or end of
    `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
    is backslash-escaped.

Where rules 1--12 above are compatible with multiple parsings,
the following principles resolve ambiguity:

13. The number of nestings should be minimized. Thus, for example,
    an interpretation `<strong>...</strong>` is always preferred to
    `<em><em>...</em></em>`.

14. An interpretation `<em><strong>...</strong></em>` is always
    preferred to `<strong><em>...</em></strong>`.

15. When two potential emphasis or strong emphasis spans overlap,
    so that the second begins before the first ends and ends after
    the first ends, the first takes precedence. Thus, for example,
    `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
    than `*foo <em>bar* baz</em>`.

16. When there are two potential emphasis or strong emphasis spans
    with the same closing delimiter, the shorter one (the one that
    opens later) takes precedence. Thus, for example,
    `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
    rather than `<strong>foo **bar baz</strong>`.

17. Inline code spans, links, images, and HTML tags group more tightly
    than emphasis.  So, when there is a choice between an interpretation
    that contains one of these elements and one that does not, the
    former always wins.  Thus, for example, `*[foo*](bar)` is
    parsed as `*<a href="bar">foo*</a>` rather than as
    `<em>[foo</em>](bar)`.

These rules can be illustrated through a series of examples.

Rule 1:

```````````````````````````````` example
*foo bar*
.
<p><em>foo bar</em></p>
````````````````````````````````


This is not emphasis, because the opening `*` is followed by
whitespace, and hence not part of a [left-flanking delimiter run]:

```````````````````````````````` example
a * foo bar*
.
<p>a * foo bar*</p>
````````````````````````````````


This is not emphasis, because the opening `*` is preceded
by an alphanumeric and followed by punctuation, and hence
not part of a [left-flanking delimiter run]:

```````````````````````````````` example
a*"foo"*
.
<p>a*&quot;foo&quot;*</p>
````````````````````````````````


Unicode nonbreaking spaces count as whitespace, too:

```````````````````````````````` example
* a *
.
<p>* a *</p>
````````````````````````````````


Intraword emphasis with `*` is permitted:

```````````````````````````````` example
foo*bar*
.
<p>foo<em>bar</em></p>
````````````````````````````````


```````````````````````````````` example
5*6*78
.
<p>5<em>6</em>78</p>
````````````````````````````````


Rule 2:

```````````````````````````````` example
_foo bar_
.
<p><em>foo bar</em></p>
````````````````````````````````


This is not emphasis, because the opening `_` is followed by
whitespace:

```````````````````````````````` example
_ foo bar_
.
<p>_ foo bar_</p>
````````````````````````````````


This is not emphasis, because the opening `_` is preceded
by an alphanumeric and followed by punctuation:

```````````````````````````````` example
a_"foo"_
.
<p>a_&quot;foo&quot;_</p>
````````````````````````````````


Emphasis with `_` is not allowed inside words:

```````````````````````````````` example
foo_bar_
.
<p>foo_bar_</p>
````````````````````````````````


```````````````````````````````` example
5_6_78
.
<p>5_6_78</p>
````````````````````````````````


```````````````````````````````` example
пристаням_стремятся_
.
<p>пристаням_стремятся_</p>
````````````````````````````````


Here `_` does not generate emphasis, because the first delimiter run
is right-flanking and the second left-flanking:

```````````````````````````````` example
aa_"bb"_cc
.
<p>aa_&quot;bb&quot;_cc</p>
````````````````````````````````


This is emphasis, even though the opening delimiter is
both left- and right-flanking, because it is preceded by
punctuation:

```````````````````````````````` example
foo-_(bar)_
.
<p>foo-<em>(bar)</em></p>
````````````````````````````````


Rule 3:

This is not emphasis, because the closing delimiter does
not match the opening delimiter:

```````````````````````````````` example
_foo*
.
<p>_foo*</p>
````````````````````````````````


This is not emphasis, because the closing `*` is preceded by
whitespace:

```````````````````````````````` example
*foo bar *
.
<p>*foo bar *</p>
````````````````````````````````


A newline also counts as whitespace:

```````````````````````````````` example
*foo bar
*
.
<p>*foo bar
*</p>
````````````````````````````````


This is not emphasis, because the second `*` is
preceded by punctuation and followed by an alphanumeric
(hence it is not part of a [right-flanking delimiter run]):

```````````````````````````````` example
*(*foo)
.
<p>*(*foo)</p>
````````````````````````````````


The point of this restriction is more easily appreciated
with this example:

```````````````````````````````` example
*(*foo*)*
.
<p><em>(<em>foo</em>)</em></p>
````````````````````````````````


Intraword emphasis with `*` is allowed:

```````````````````````````````` example
*foo*bar
.
<p><em>foo</em>bar</p>
````````````````````````````````



Rule 4:

This is not emphasis, because the closing `_` is preceded by
whitespace:

```````````````````````````````` example
_foo bar _
.
<p>_foo bar _</p>
````````````````````````````````


This is not emphasis, because the second `_` is
preceded by punctuation and followed by an alphanumeric:

```````````````````````````````` example
_(_foo)
.
<p>_(_foo)</p>
````````````````````````````````


This is emphasis within emphasis:

```````````````````````````````` example
_(_foo_)_
.
<p><em>(<em>foo</em>)</em></p>
````````````````````````````````


Intraword emphasis is disallowed for `_`:

```````````````````````````````` example
_foo_bar
.
<p>_foo_bar</p>
````````````````````````````````


```````````````````````````````` example
_пристаням_стремятся
.
<p>_пристаням_стремятся</p>
````````````````````````````````


```````````````````````````````` example
_foo_bar_baz_
.
<p><em>foo_bar_baz</em></p>
````````````````````````````````


This is emphasis, even though the closing delimiter is
both left- and right-flanking, because it is followed by
punctuation:

```````````````````````````````` example
_(bar)_.
.
<p><em>(bar)</em>.</p>
````````````````````````````````


Rule 5:

```````````````````````````````` example
**foo bar**
.
<p><strong>foo bar</strong></p>
````````````````````````````````


This is not strong emphasis, because the opening delimiter is
followed by whitespace:

```````````````````````````````` example
** foo bar**
.
<p>** foo bar**</p>
````````````````````````````````


This is not strong emphasis, because the opening `**` is preceded
by an alphanumeric and followed by punctuation, and hence
not part of a [left-flanking delimiter run]:

```````````````````````````````` example
a**"foo"**
.
<p>a**&quot;foo&quot;**</p>
````````````````````````````````


Intraword strong emphasis with `**` is permitted:

```````````````````````````````` example
foo**bar**
.
<p>foo<strong>bar</strong></p>
````````````````````````````````


Rule 6:

```````````````````````````````` example
__foo bar__
.
<p><strong>foo bar</strong></p>
````````````````````````````````


This is not strong emphasis, because the opening delimiter is
followed by whitespace:

```````````````````````````````` example
__ foo bar__
.
<p>__ foo bar__</p>
````````````````````````````````


A newline counts as whitespace:

```````````````````````````````` example
__
foo bar__
.
<p>__
foo bar__</p>
````````````````````````````````


This is not strong emphasis, because the opening `__` is preceded
by an alphanumeric and followed by punctuation:

```````````````````````````````` example
a__"foo"__
.
<p>a__&quot;foo&quot;__</p>
````````````````````````````````


Intraword strong emphasis is forbidden with `__`:

```````````````````````````````` example
foo__bar__
.
<p>foo__bar__</p>
````````````````````````````````


```````````````````````````````` example
5__6__78
.
<p>5__6__78</p>
````````````````````````````````


```````````````````````````````` example
пристаням__стремятся__
.
<p>пристаням__стремятся__</p>
````````````````````````````````


```````````````````````````````` example
__foo, __bar__, baz__
.
<p><strong>foo, <strong>bar</strong>, baz</strong></p>
````````````````````````````````


This is strong emphasis, even though the opening delimiter is
both left- and right-flanking, because it is preceded by
punctuation:

```````````````````````````````` example
foo-__(bar)__
.
<p>foo-<strong>(bar)</strong></p>
````````````````````````````````



Rule 7:

This is not strong emphasis, because the closing delimiter is preceded
by whitespace:

```````````````````````````````` example
**foo bar **
.
<p>**foo bar **</p>
````````````````````````````````


(Nor can it be interpreted as an emphasized `*foo bar *`, because of
Rule 11.)

This is not strong emphasis, because the second `**` is
preceded by punctuation and followed by an alphanumeric:

```````````````````````````````` example
**(**foo)
.
<p>**(**foo)</p>
````````````````````````````````


The point of this restriction is more easily appreciated
with these examples:

```````````````````````````````` example
*(**foo**)*
.
<p><em>(<strong>foo</strong>)</em></p>
````````````````````````````````


```````````````````````````````` example
**Gomphocarpus (*Gomphocarpus physocarpus*, syn.
*Asclepias physocarpa*)**
.
<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
<em>Asclepias physocarpa</em>)</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo "*bar*" foo**
.
<p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>
````````````````````````````````


Intraword emphasis:

```````````````````````````````` example
**foo**bar
.
<p><strong>foo</strong>bar</p>
````````````````````````````````


Rule 8:

This is not strong emphasis, because the closing delimiter is
preceded by whitespace:

```````````````````````````````` example
__foo bar __
.
<p>__foo bar __</p>
````````````````````````````````


This is not strong emphasis, because the second `__` is
preceded by punctuation and followed by an alphanumeric:

```````````````````````````````` example
__(__foo)
.
<p>__(__foo)</p>
````````````````````````````````


The point of this restriction is more easily appreciated
with this example:

```````````````````````````````` example
_(__foo__)_
.
<p><em>(<strong>foo</strong>)</em></p>
````````````````````````````````


Intraword strong emphasis is forbidden with `__`:

```````````````````````````````` example
__foo__bar
.
<p>__foo__bar</p>
````````````````````````````````


```````````````````````````````` example
__пристаням__стремятся
.
<p>__пристаням__стремятся</p>
````````````````````````````````


```````````````````````````````` example
__foo__bar__baz__
.
<p><strong>foo__bar__baz</strong></p>
````````````````````````````````


This is strong emphasis, even though the closing delimiter is
both left- and right-flanking, because it is followed by
punctuation:

```````````````````````````````` example
__(bar)__.
.
<p><strong>(bar)</strong>.</p>
````````````````````````````````


Rule 9:

Any nonempty sequence of inline elements can be the contents of an
emphasized span.

```````````````````````````````` example
*foo [bar](/url)*
.
<p><em>foo <a href="/url">bar</a></em></p>
````````````````````````````````


```````````````````````````````` example
*foo
bar*
.
<p><em>foo
bar</em></p>
````````````````````````````````


In particular, emphasis and strong emphasis can be nested
inside emphasis:

```````````````````````````````` example
_foo __bar__ baz_
.
<p><em>foo <strong>bar</strong> baz</em></p>
````````````````````````````````


```````````````````````````````` example
_foo _bar_ baz_
.
<p><em>foo <em>bar</em> baz</em></p>
````````````````````````````````


```````````````````````````````` example
__foo_ bar_
.
<p><em><em>foo</em> bar</em></p>
````````````````````````````````


```````````````````````````````` example
*foo *bar**
.
<p><em>foo <em>bar</em></em></p>
````````````````````````````````


```````````````````````````````` example
*foo **bar** baz*
.
<p><em>foo <strong>bar</strong> baz</em></p>
````````````````````````````````

```````````````````````````````` example
*foo**bar**baz*
.
<p><em>foo<strong>bar</strong>baz</em></p>
````````````````````````````````

Note that in the preceding case, the interpretation

``` markdown
<p><em>foo</em><em>bar<em></em>baz</em></p>
```


is precluded by the condition that a delimiter that
can both open and close (like the `*` after `foo`)
cannot form emphasis if the sum of the lengths of
the delimiter runs containing the opening and
closing delimiters is a multiple of 3.

The same condition ensures that the following
cases are all strong emphasis nested inside
emphasis, even when the interior spaces are
omitted:


```````````````````````````````` example
***foo** bar*
.
<p><em><strong>foo</strong> bar</em></p>
````````````````````````````````


```````````````````````````````` example
*foo **bar***
.
<p><em>foo <strong>bar</strong></em></p>
````````````````````````````````


```````````````````````````````` example
*foo**bar***
.
<p><em>foo<strong>bar</strong></em></p>
````````````````````````````````


Indefinite levels of nesting are possible:

```````````````````````````````` example
*foo **bar *baz* bim** bop*
.
<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
````````````````````````````````


```````````````````````````````` example
*foo [*bar*](/url)*
.
<p><em>foo <a href="/url"><em>bar</em></a></em></p>
````````````````````````````````


There can be no empty emphasis or strong emphasis:

```````````````````````````````` example
** is not an empty emphasis
.
<p>** is not an empty emphasis</p>
````````````````````````````````


```````````````````````````````` example
**** is not an empty strong emphasis
.
<p>**** is not an empty strong emphasis</p>
````````````````````````````````



Rule 10:

Any nonempty sequence of inline elements can be the contents of a
strongly emphasized span.

```````````````````````````````` example
**foo [bar](/url)**
.
<p><strong>foo <a href="/url">bar</a></strong></p>
````````````````````````````````


```````````````````````````````` example
**foo
bar**
.
<p><strong>foo
bar</strong></p>
````````````````````````````````


In particular, emphasis and strong emphasis can be nested
inside strong emphasis:

```````````````````````````````` example
__foo _bar_ baz__
.
<p><strong>foo <em>bar</em> baz</strong></p>
````````````````````````````````


```````````````````````````````` example
__foo __bar__ baz__
.
<p><strong>foo <strong>bar</strong> baz</strong></p>
````````````````````````````````


```````````````````````````````` example
____foo__ bar__
.
<p><strong><strong>foo</strong> bar</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo **bar****
.
<p><strong>foo <strong>bar</strong></strong></p>
````````````````````````````````


```````````````````````````````` example
**foo *bar* baz**
.
<p><strong>foo <em>bar</em> baz</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo*bar*baz**
.
<p><strong>foo<em>bar</em>baz</strong></p>
````````````````````````````````


```````````````````````````````` example
***foo* bar**
.
<p><strong><em>foo</em> bar</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo *bar***
.
<p><strong>foo <em>bar</em></strong></p>
````````````````````````````````


Indefinite levels of nesting are possible:

```````````````````````````````` example
**foo *bar **baz**
bim* bop**
.
<p><strong>foo <em>bar <strong>baz</strong>
bim</em> bop</strong></p>
````````````````````````````````


```````````````````````````````` example
**foo [*bar*](/url)**
.
<p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
````````````````````````````````


There can be no empty emphasis or strong emphasis:

```````````````````````````````` example
__ is not an empty emphasis
.
<p>__ is not an empty emphasis</p>
````````````````````````````````


```````````````````````````````` example
____ is not an empty strong emphasis
.
<p>____ is not an empty strong emphasis</p>
````````````````````````````````



Rule 11:

```````````````````````````````` example
foo ***
.
<p>foo ***</p>
````````````````````````````````


```````````````````````````````` example
foo *\**
.
<p>foo <em>*</em></p>
````````````````````````````````


```````````````````````````````` example
foo *_*
.
<p>foo <em>_</em></p>
````````````````````````````````


```````````````````````````````` example
foo *****
.
<p>foo *****</p>
````````````````````````````````


```````````````````````````````` example
foo **\***
.
<p>foo <strong>*</strong></p>
````````````````````````````````


```````````````````````````````` example
foo **_**
.
<p>foo <strong>_</strong></p>
````````````````````````````````


Note that when delimiters do not match evenly, Rule 11 determines
that the excess literal `*` characters will appear outside of the
emphasis, rather than inside it:

```````````````````````````````` example
**foo*
.
<p>*<em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
*foo**
.
<p><em>foo</em>*</p>
````````````````````````````````


```````````````````````````````` example
***foo**
.
<p>*<strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
****foo*
.
<p>***<em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
**foo***
.
<p><strong>foo</strong>*</p>
````````````````````````````````


```````````````````````````````` example
*foo****
.
<p><em>foo</em>***</p>
````````````````````````````````



Rule 12:

```````````````````````````````` example
foo ___
.
<p>foo ___</p>
````````````````````````````````


```````````````````````````````` example
foo _\__
.
<p>foo <em>_</em></p>
````````````````````````````````


```````````````````````````````` example
foo _*_
.
<p>foo <em>*</em></p>
````````````````````````````````


```````````````````````````````` example
foo _____
.
<p>foo _____</p>
````````````````````````````````


```````````````````````````````` example
foo __\___
.
<p>foo <strong>_</strong></p>
````````````````````````````````


```````````````````````````````` example
foo __*__
.
<p>foo <strong>*</strong></p>
````````````````````````````````


```````````````````````````````` example
__foo_
.
<p>_<em>foo</em></p>
````````````````````````````````


Note that when delimiters do not match evenly, Rule 12 determines
that the excess literal `_` characters will appear outside of the
emphasis, rather than inside it:

```````````````````````````````` example
_foo__
.
<p><em>foo</em>_</p>
````````````````````````````````


```````````````````````````````` example
___foo__
.
<p>_<strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
____foo_
.
<p>___<em>foo</em></p>
````````````````````````````````


```````````````````````````````` example
__foo___
.
<p><strong>foo</strong>_</p>
````````````````````````````````


```````````````````````````````` example
_foo____
.
<p><em>foo</em>___</p>
````````````````````````````````


Rule 13 implies that if you want emphasis nested directly inside
emphasis, you must use different delimiters:

```````````````````````````````` example
**foo**
.
<p><strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
*_foo_*
.
<p><em><em>foo</em></em></p>
````````````````````````````````


```````````````````````````````` example
__foo__
.
<p><strong>foo</strong></p>
````````````````````````````````


```````````````````````````````` example
_*foo*_
.
<p><em><em>foo</em></em></p>
````````````````````````````````


However, strong emphasis within strong emphasis is possible without
switching delimiters:

```````````````````````````````` example
****foo****
.
<p><strong><strong>foo</strong></strong></p>
````````````````````````````````


```````````````````````````````` example
____foo____
.
<p><strong><strong>foo</strong></strong></p>
````````````````````````````````



Rule 13 can be applied to arbitrarily long sequences of
delimiters:

```````````````````````````````` example
******foo******
.
<p><strong><strong><strong>foo</strong></strong></strong></p>
````````````````````````````````


Rule 14:

```````````````````````````````` example
***foo***
.
<p><em><strong>foo</strong></em></p>
````````````````````````````````


```````````````````````````````` example
_____foo_____
.
<p><em><strong><strong>foo</strong></strong></em></p>
````````````````````````````````


Rule 15:

```````````````````````````````` example
*foo _bar* baz_
.
<p><em>foo _bar</em> baz_</p>
````````````````````````````````


```````````````````````````````` example
*foo __bar *baz bim__ bam*
.
<p><em>foo <strong>bar *baz bim</strong> bam</em></p>
````````````````````````````````


Rule 16:

```````````````````````````````` example
**foo **bar baz**
.
<p>**foo <strong>bar baz</strong></p>
````````````````````````````````


```````````````````````````````` example
*foo *bar baz*
.
<p>*foo <em>bar baz</em></p>
````````````````````````````````


Rule 17:

```````````````````````````````` example
*[bar*](/url)
.
<p>*<a href="/url">bar*</a></p>
````````````````````````````````


```````````````````````````````` example
_foo [bar_](/url)
.
<p>_foo <a href="/url">bar_</a></p>
````````````````````````````````


```````````````````````````````` example
*<img src="foo" title="*"/>
.
<p>*<img src="foo" title="*"/></p>
````````````````````````````````


```````````````````````````````` example
**<a href="**">
.
<p>**<a href="**"></p>
````````````````````````````````


```````````````````````````````` example
__<a href="__">
.
<p>__<a href="__"></p>
````````````````````````````````


```````````````````````````````` example
*a `*`*
.
<p><em>a <code>*</code></em></p>
````````````````````````````````


```````````````````````````````` example
_a `_`_
.
<p><em>a <code>_</code></em></p>
````````````````````````````````


```````````````````````````````` example
**a<http://foo.bar/?q=**>
.
<p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
````````````````````````````````


```````````````````````````````` example
__a<http://foo.bar/?q=__>
.
<p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
````````````````````````````````



## Links

A link contains [link text] (the visible text), a [link destination]
(the URI that is the link destination), and optionally a [link title].
There are two basic kinds of links in Markdown.  In [inline links] the
destination and title are given immediately after the link text.  In
[reference links] the destination and title are defined elsewhere in
the document.

A [link text](@) consists of a sequence of zero or more
inline elements enclosed by square brackets (`[` and `]`).
The following rules apply:

- Links may not contain other links, at any level of nesting. If
  multiple otherwise valid link definitions appear nested inside each
  other, the inner-most definition is used.

- Brackets are allowed in the [link text] only if (a) they
  are backslash-escaped or (b) they appear as a matched pair of brackets,
  with an open bracket `[`, a sequence of zero or more inlines, and
  a close bracket `]`.

- Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
  than the brackets in link text.  Thus, for example,
  `` [foo`]` `` could not be a link text, since the second `]`
  is part of a code span.

- The brackets in link text bind more tightly than markers for
  [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.

A [link destination](@) consists of either

- a sequence of zero or more characters between an opening `<` and a
  closing `>` that contains no spaces, line breaks, or unescaped
  `<` or `>` characters, or

- a nonempty sequence of characters that does not include
  ASCII space or control characters, and includes parentheses
  only if (a) they are backslash-escaped or (b) they are part of
  a balanced pair of unescaped parentheses.

A [link title](@) consists of either

- a sequence of zero or more characters between straight double-quote
  characters (`"`), including a `"` character only if it is
  backslash-escaped, or

- a sequence of zero or more characters between straight single-quote
  characters (`'`), including a `'` character only if it is
  backslash-escaped, or

- a sequence of zero or more characters between matching parentheses
  (`(...)`), including a `)` character only if it is backslash-escaped.

Although [link titles] may span multiple lines, they may not contain
a [blank line].
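
As an informal illustration (not part of the spec), the definitions of
[link destination] and [link title] above can be checked with a short
Python sketch.  The function names `is_link_destination` and
`is_link_title` are hypothetical, and the code follows only the wording
given above; it is a minimal sketch, not a reference implementation.

```python
import re
import string


def _first_unescaped(text, chars):
    """Index of the first character of `chars` in `text` that is not
    preceded by a backslash, or -1 if there is none."""
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the backslash and the character it precedes
            continue
        if text[i] in chars:
            return i
        i += 1
    return -1


def _is_pointy_destination(s):
    # `<...>` form: no spaces, line breaks, or unescaped `<` / `>` inside.
    if len(s) < 2 or s[0] != "<" or s[-1] != ">":
        return False
    if any(c in s for c in " \n\r"):
        return False
    # The first unescaped `<` or `>` after the opener must be the closer.
    return _first_unescaped(s[1:], "<>") == len(s) - 2


def _is_bare_destination(s):
    # Nonempty, no ASCII space or control characters, and parentheses
    # only if backslash-escaped or balanced.
    if not s:
        return False
    depth = 0
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == " " or ord(ch) < 0x20 or ord(ch) == 0x7F:
            return False
        if ch == "\\" and i + 1 < len(s) and s[i + 1] in string.punctuation:
            i += 2  # a backslash-escaped punctuation character
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0


def is_link_destination(s):
    return _is_pointy_destination(s) or _is_bare_destination(s)


def is_link_title(s):
    closers = {'"': '"', "'": "'", "(": ")"}
    if len(s) < 2 or s[0] not in closers or s[-1] != closers[s[0]]:
        return False
    if re.search(r"\n[ \t]*\n", s):  # titles may not contain a blank line
        return False
    # The closing delimiter may appear inside only if backslash-escaped.
    return _first_unescaped(s[1:], closers[s[0]]) == len(s) - 2
```

Under these assumptions, `is_link_destination("foo(and(bar))")` holds
while `is_link_destination("/my uri")` does not, and
`is_link_title('"title "and" title"')` is rejected because the inner
quotes are not escaped.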

An [inline link](@) consists of a [link text] followed immediately
by a left parenthesis `(`, optional [whitespace], an optional
[link destination], an optional [link title] separated from the link
destination by [whitespace], optional [whitespace], and a right
parenthesis `)`. The link's text consists of the inlines contained
in the [link text] (excluding the enclosing square brackets).
The link's URI consists of the link destination, excluding enclosing
`<...>` if present, with backslash-escapes in effect as described
above.  The link's title consists of the link title, excluding its
enclosing delimiters, with backslash-escapes in effect as described
above.

Here is a simple inline link:

```````````````````````````````` example
[link](/uri "title")
.
<p><a href="/uri" title="title">link</a></p>
````````````````````````````````


The title may be omitted:

```````````````````````````````` example
[link](/uri)
.
<p><a href="/uri">link</a></p>
````````````````````````````````


Both the title and the destination may be omitted:

```````````````````````````````` example
[link]()
.
<p><a href="">link</a></p>
````````````````````````````````


```````````````````````````````` example
[link](<>)
.
<p><a href="">link</a></p>
````````````````````````````````


The destination cannot contain spaces or line breaks,
even if enclosed in pointy brackets:

```````````````````````````````` example
[link](/my uri)
.
<p>[link](/my uri)</p>
````````````````````````````````


```````````````````````````````` example
[link](</my uri>)
.
<p>[link](&lt;/my uri&gt;)</p>
````````````````````````````````


```````````````````````````````` example
[link](foo
bar)
.
<p>[link](foo
bar)</p>
````````````````````````````````


```````````````````````````````` example
[link](<foo
bar>)
.
<p>[link](<foo
bar>)</p>
````````````````````````````````

Parentheses inside the link destination may be escaped:

```````````````````````````````` example
[link](\(foo\))
.
<p><a href="(foo)">link</a></p>
````````````````````````````````

Any number of parentheses are allowed without escaping, as long as they are
balanced:

```````````````````````````````` example
[link](foo(and(bar)))
.
<p><a href="foo(and(bar))">link</a></p>
````````````````````````````````

However, if you have unbalanced parentheses, you need to escape or use the
`<...>` form:

```````````````````````````````` example
[link](foo\(and\(bar\))
.
<p><a href="foo(and(bar)">link</a></p>
````````````````````````````````


```````````````````````````````` example
[link](<foo(and(bar)>)
.
<p><a href="foo(and(bar)">link</a></p>
````````````````````````````````


Parentheses and other symbols can also be escaped, as usual
in Markdown:

```````````````````````````````` example
[link](foo\)\:)
.
<p><a href="foo):">link</a></p>
````````````````````````````````


A link can contain fragment identifiers and queries:

```````````````````````````````` example
[link](#fragment)

[link](http://example.com#fragment)

[link](http://example.com?foo=3#frag)
.
<p><a href="#fragment">link</a></p>
<p><a href="http://example.com#fragment">link</a></p>
<p><a href="http://example.com?foo=3#frag">link</a></p>
````````````````````````````````


Note that a backslash before a non-escapable character is
just a backslash:

```````````````````````````````` example
[link](foo\bar)
.
<p><a href="foo%5Cbar">link</a></p>
````````````````````````````````


URL-escaping should be left alone inside the destination, as all
URL-escaped characters are also valid URL characters. Entity and
numeric character references in the destination will be parsed
into the corresponding Unicode code points, as usual.  These may
be optionally URL-escaped when written as HTML, but this spec
does not enforce any particular policy for rendering URLs in
HTML or other formats.  Renderers may make different decisions
about how to escape or normalize URLs in the output.

```````````````````````````````` example
[link](foo%20b&auml;)
.
<p><a href="foo%20b%C3%A4">link</a></p>
````````````````````````````````


Note that, because titles can often be parsed as destinations,
if you try to omit the destination and keep the title, you'll
get unexpected results:

```````````````````````````````` example
[link]("title")
.
<p><a href="%22title%22">link</a></p>
````````````````````````````````


Titles may be in single quotes, double quotes, or parentheses:

```````````````````````````````` example
[link](/url "title")
[link](/url 'title')
[link](/url (title))
.
<p><a href="/url" title="title">link</a>
<a href="/url" title="title">link</a>
<a href="/url" title="title">link</a></p>
````````````````````````````````


Backslash escapes and entity and numeric character references
may be used in titles:

```````````````````````````````` example
[link](/url "title \"&quot;")
.
<p><a href="/url" title="title &quot;&quot;">link</a></p>
````````````````````````````````


Titles must be separated from the link destination using [whitespace].
Other [Unicode whitespace], such as a non-breaking space, does not work.

```````````````````````````````` example
[link](/url "title")
.
<p><a href="/url%C2%A0%22title%22">link</a></p>
````````````````````````````````


Nested balanced quotes are not allowed without escaping:

```````````````````````````````` example
[link](/url "title "and" title")
.
<p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
````````````````````````````````


But it is easy to work around this by using a different quote type:

```````````````````````````````` example
[link](/url 'title "and" title')
.
<p><a href="/url" title="title &quot;and&quot; title">link</a></p>
````````````````````````````````


(Note:  `Markdown.pl` did allow double quotes inside a double-quoted
title, and its test suite included a test demonstrating this.
But it is hard to see a good rationale for the extra complexity this
brings, since there are already many ways---backslash escaping,
entity and numeric character references, or using a different
quote type for the enclosing title---to write titles containing
double quotes.  `Markdown.pl`'s handling of titles has a number
of other strange features.  For example, it allows single-quoted
titles in inline links, but not reference links.
And, in reference links but not inline links, it allows a title to begin
with `"` and end with `)`.  `Markdown.pl` 1.0.1 even allows
titles with no closing quotation mark, though 1.0.2b8 does not.
It seems preferable to adopt a simple, rational rule that works
the same way in inline links and link reference definitions.)

[Whitespace] is allowed around the destination and title:

```````````````````````````````` example
[link]( /uri
"title" )
.
<p><a href="/uri" title="title">link</a></p>
````````````````````````````````


But it is not allowed between the link text and the
following parenthesis:

```````````````````````````````` example
[link] (/uri)
.
<p>[link] (/uri)</p>
````````````````````````````````


The link text may contain balanced brackets, but not unbalanced ones,
unless they are escaped:

```````````````````````````````` example
[link [foo [bar]]](/uri)
.
<p><a href="/uri">link [foo [bar]]</a></p>
````````````````````````````````


```````````````````````````````` example
[link] bar](/uri)
.
<p>[link] bar](/uri)</p>
````````````````````````````````


```````````````````````````````` example
[link [bar](/uri)
.
<p>[link <a href="/uri">bar</a></p>
````````````````````````````````


```````````````````````````````` example
[link \[bar](/uri)
.
<p><a href="/uri">link [bar</a></p>
````````````````````````````````


The link text may contain inline content:

```````````````````````````````` example
[link *foo **bar** `#`*](/uri)
.
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
````````````````````````````````


```````````````````````````````` example
[![moon](moon.jpg)](/uri)
.
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
````````````````````````````````


However, links may not contain other links, at any level of nesting.

```````````````````````````````` example
[foo [bar](/uri)](/uri)
.
<p>[foo <a href="/uri">bar</a>](/uri)</p>
````````````````````````````````


```````````````````````````````` example
[foo *[bar [baz](/uri)](/uri)*](/uri)
.
<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
````````````````````````````````


```````````````````````````````` example
![[[foo](uri1)](uri2)](uri3)
.
<p><img src="uri3" alt="[foo](uri2)" /></p>
````````````````````````````````


These cases illustrate the precedence of link text grouping over
emphasis grouping:

```````````````````````````````` example
*[foo*](/uri)
.
<p>*<a href="/uri">foo*</a></p>
````````````````````````````````


```````````````````````````````` example
[foo *bar](baz*)
.
<p><a href="baz*">foo *bar</a></p>
````````````````````````````````


Note that brackets that *aren't* part of links do not take
precedence:

```````````````````````````````` example
*foo [bar* baz]
.
<p><em>foo [bar</em> baz]</p>
````````````````````````````````


These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:

```````````````````````````````` example
[foo <bar attr="](baz)">
.
<p>[foo <bar attr="](baz)"></p>
````````````````````````````````


```````````````````````````````` example
[foo`](/uri)`
.
<p>[foo<code>](/uri)</code></p>
````````````````````````````````


```````````````````````````````` example
[foo<http://example.com/?search=](uri)>
.
<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
````````````````````````````````


There are three kinds of [reference link](@)s:
[full](#full-reference-link), [collapsed](#collapsed-reference-link),
and [shortcut](#shortcut-reference-link).

A [full reference link](@)
consists of a [link text] immediately followed by a [link label]
that [matches] a [link reference definition] elsewhere in the document.

A [link label](@) begins with a left bracket (`[`) and ends
with the first right bracket (`]`) that is not backslash-escaped.
Between these brackets there must be at least one [non-whitespace character].
Unescaped square bracket characters are not allowed in
[link labels].  A link label can have at most 999
characters inside the square brackets.

One label [matches](@)
another if and only if their normalized forms are equal.  To normalize a
label, perform the *Unicode case fold* and collapse consecutive internal
[whitespace] to a single space.  If there are multiple
matching reference link definitions, the one that comes first in the
document is used.  (It is desirable in such cases to emit a warning.)

The contents of the first link label are parsed as inlines, which are
used as the link's text.  The link's URI and title are provided by the
matching [link reference definition].

Here is a simple example:

```````````````````````````````` example
[foo][bar]

[bar]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


The rules for the [link text] are the same as with
[inline links].
Thus:

The link text may contain balanced brackets, but not unbalanced ones,
unless they are escaped:

```````````````````````````````` example
[link [foo [bar]]][ref]

[ref]: /uri
.
<p><a href="/uri">link [foo [bar]]</a></p>
````````````````````````````````


```````````````````````````````` example
[link \[bar][ref]

[ref]: /uri
.
<p><a href="/uri">link [bar</a></p>
````````````````````````````````


The link text may contain inline content:

```````````````````````````````` example
[link *foo **bar** `#`*][ref]

[ref]: /uri
.
<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
````````````````````````````````


```````````````````````````````` example
[![moon](moon.jpg)][ref]

[ref]: /uri
.
<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
````````````````````````````````


However, links may not contain other links, at any level of nesting.

```````````````````````````````` example
[foo [bar](/uri)][ref]

[ref]: /uri
.
<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
````````````````````````````````


```````````````````````````````` example
[foo *bar [baz][ref]*][ref]

[ref]: /uri
.
<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
````````````````````````````````


(In the examples above, we have two [shortcut reference links]
instead of one [full reference link].)

The following cases illustrate the precedence of link text grouping over
emphasis grouping:

```````````````````````````````` example
*[foo*][ref]

[ref]: /uri
.
<p>*<a href="/uri">foo*</a></p>
````````````````````````````````


```````````````````````````````` example
[foo *bar][ref]

[ref]: /uri
.
<p><a href="/uri">foo *bar</a></p>
````````````````````````````````


These cases illustrate the precedence of HTML tags, code spans,
and autolinks over link grouping:

```````````````````````````````` example
[foo <bar attr="][ref]">

[ref]: /uri
.
<p>[foo <bar attr="][ref]"></p>
````````````````````````````````


```````````````````````````````` example
[foo`][ref]`

[ref]: /uri
.
<p>[foo<code>][ref]</code></p>
````````````````````````````````


```````````````````````````````` example
[foo<http://example.com/?search=][ref]>

[ref]: /uri
.
<p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
````````````````````````````````


Matching is case-insensitive:

```````````````````````````````` example
[foo][BaR]

[bar]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


Unicode case fold is used:

```````````````````````````````` example
[Толпой][Толпой] is a Russian word.

[ТОЛПОЙ]: /url
.
<p><a href="/url">Толпой</a> is a Russian word.</p>
````````````````````````````````


Consecutive internal [whitespace] is treated as one space for
purposes of determining matching:

```````````````````````````````` example
[Foo
bar]: /url

[Baz][Foo bar]
.
<p><a href="/url">Baz</a></p>
````````````````````````````````


No [whitespace] is allowed between the [link text] and the
[link label]:

```````````````````````````````` example
[foo] [bar]

[bar]: /url "title"
.
<p>[foo] <a href="/url" title="title">bar</a></p>
````````````````````````````````


```````````````````````````````` example
[foo]
[bar]

[bar]: /url "title"
.
<p>[foo]
<a href="/url" title="title">bar</a></p>
````````````````````````````````


This is a departure from John Gruber's original Markdown syntax
description, which explicitly allows whitespace between the link
text and the link label.  It brings reference links in line with
[inline links], which (according to both original Markdown and
this spec) cannot have whitespace after the link text.  More
importantly, it prevents inadvertent capture of consecutive
[shortcut reference links].  If whitespace is allowed between the
link text and the link label, then in the following we will have
a single reference link, not two shortcut reference links, as
intended:

``` markdown
[foo]
[bar]

[foo]: /url1
[bar]: /url2
```

(Note that [shortcut reference links] were introduced by Gruber
himself in a beta version of `Markdown.pl`, but never included
in the official syntax description.  Without shortcut reference
links, it is harmless to allow space between the link text and
link label; but once shortcut references are introduced, it is
too dangerous to allow this, as it frequently leads to
unintended results.)

When there are multiple matching [link reference definitions],
the first is used:

```````````````````````````````` example
[foo]: /url1

[foo]: /url2

[bar][foo]
.
<p><a href="/url1">bar</a></p>
````````````````````````````````


Note that matching is performed on normalized strings, not parsed
inline content.  So the following does not match, even though the
labels define equivalent inline content:

```````````````````````````````` example
[bar][foo\!]

[foo!]: /url
.
<p>[bar][foo!]</p>
````````````````````````````````


[Link labels] cannot contain brackets, unless they are
backslash-escaped:

```````````````````````````````` example
[foo][ref[]

[ref[]: /uri
.
<p>[foo][ref[]</p>
<p>[ref[]: /uri</p>
````````````````````````````````


```````````````````````````````` example
[foo][ref[bar]]

[ref[bar]]: /uri
.
<p>[foo][ref[bar]]</p>
<p>[ref[bar]]: /uri</p>
````````````````````````````````


```````````````````````````````` example
[[[foo]]]

[[[foo]]]: /url
.
<p>[[[foo]]]</p>
<p>[[[foo]]]: /url</p>
````````````````````````````````


```````````````````````````````` example
[foo][ref\[]

[ref\[]: /uri
.
<p><a href="/uri">foo</a></p>
````````````````````````````````


Note that in this example `]` is not backslash-escaped:

```````````````````````````````` example
[bar\\]: /uri

[bar\\]
.
<p><a href="/uri">bar\</a></p>
````````````````````````````````


A [link label] must contain at least one [non-whitespace character]:

```````````````````````````````` example
[]

[]: /uri
.
<p>[]</p>
<p>[]: /uri</p>
````````````````````````````````


```````````````````````````````` example
[
]

[
]: /uri
.
<p>[
]</p>
<p>[
]: /uri</p>
````````````````````````````````


A [collapsed reference link](@)
consists of a [link label] that [matches] a
[link reference definition] elsewhere in the
document, followed by the string `[]`.
The contents of the first link label are parsed as inlines,
which are used as the link's text.  The link's URI and title are
provided by the matching reference link definition.  Thus,
`[foo][]` is equivalent to `[foo][foo]`.

```````````````````````````````` example
[foo][]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


```````````````````````````````` example
[*foo* bar][]

[*foo* bar]: /url "title"
.
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
````````````````````````````````


The link labels are case-insensitive:

```````````````````````````````` example
[Foo][]

[foo]: /url "title"
.
<p><a href="/url" title="title">Foo</a></p>
````````````````````````````````



As with full reference links, [whitespace] is not
allowed between the two sets of brackets:

```````````````````````````````` example
[foo]
[]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a>
[]</p>
````````````````````````````````


A [shortcut reference link](@)
consists of a [link label] that [matches] a
[link reference definition] elsewhere in the
document and is not followed by `[]` or a link label.
The contents of the first link label are parsed as inlines,
which are used as the link's text.  The link's URI and title
are provided by the matching link reference definition.
Thus, `[foo]` is equivalent to `[foo][]`.
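
As an informal illustration (not part of the spec), the way the three
kinds of reference link pick their [link reference definition] can be
sketched in Python.  The names `normalize_label`, `ReferenceMap`,
`define`, and `resolve` are hypothetical.  The sketch assumes that
Python's `str.casefold` is an acceptable stand-in for the Unicode case
fold, and it additionally strips leading and trailing whitespace from
labels, which is an assumption that goes beyond the wording above.

```python
import re

# Whitespace characters: space, tab, newline, line tabulation,
# form feed, carriage return.
_WS_CHARS = " \t\n\v\f\r"
_WS_RUN = re.compile("[" + _WS_CHARS + "]+")


def normalize_label(label):
    """Case-fold a label and collapse runs of whitespace to one space.

    Stripping the ends is an assumption made for this sketch.
    """
    return _WS_RUN.sub(" ", label.strip(_WS_CHARS)).casefold()


class ReferenceMap:
    """Hypothetical store for link reference definitions."""

    def __init__(self):
        self._defs = {}

    def define(self, label, destination, title=None):
        # When several definitions match, the first one in the
        # document is used, so later ones never overwrite it.
        self._defs.setdefault(normalize_label(label), (destination, title))

    def resolve(self, first_label, second_label=None):
        """Return (destination, title), or None if nothing matches.

        second_label=None models a shortcut link `[foo]`, an empty
        string models a collapsed link `[foo][]`, and any other value
        models a full link `[text][label]`.
        """
        label = second_label if second_label else first_label
        return self._defs.get(normalize_label(label))


refs = ReferenceMap()
refs.define("foo", "/url1")
refs.define("FOO", "/url2")            # ignored: "foo" is already defined
shortcut = refs.resolve("Foo")         # [Foo]
collapsed = refs.resolve("Foo", "")    # [Foo][]
full = refs.resolve("bar", "foo")      # [bar][foo]
```

In this sketch all three lookups return `("/url1", None)`, since matching
is done on normalized label strings and the first definition wins.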

```````````````````````````````` example
[foo]

[foo]: /url "title"
.
<p><a href="/url" title="title">foo</a></p>
````````````````````````````````


```````````````````````````````` example
[*foo* bar]

[*foo* bar]: /url "title"
.
<p><a href="/url" title="title"><em>foo</em> bar</a></p>
````````````````````````````````


```````````````````````````````` example
[[*foo* bar]]

[*foo* bar]: /url "title"
.
<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
````````````````````````````````


```````````````````````````````` example
[[bar [foo]

[foo]: /url
.
<p>[[bar <a href="/url">foo</a></p>
````````````````````````````````


The link labels are case-insensitive:

```````````````````````````````` example
[Foo]

[foo]: /url "title"
.
<p><a href="/url" title="title">Foo</a></p>
````````````````````````````````


A space after the link text should be preserved:

```````````````````````````````` example
[foo] bar

[foo]: /url
.
<p><a href="/url">foo</a> bar</p>
````````````````````````````````


If you just want bracketed text, you can backslash-escape the
opening bracket to avoid links:

```````````````````````````````` example
\[foo]

[foo]: /url "title"
.
<p>[foo]</p>
````````````````````````````````


Note that this is a link, because a link label ends with the first
following closing bracket:

```````````````````````````````` example
[foo*]: /url

*[foo*]
.
<p>*<a href="/url">foo*</a></p>
````````````````````````````````


Full and collapsed references take precedence over shortcut
references:

```````````````````````````````` example
[foo][bar]

[foo]: /url1
[bar]: /url2
.
<p><a href="/url2">foo</a></p>
````````````````````````````````

```````````````````````````````` example
[foo][]

[foo]: /url1
.
<p><a href="/url1">foo</a></p>
````````````````````````````````

Inline links also take precedence:

```````````````````````````````` example
[foo]()

[foo]: /url1
.
<p><a href="">foo</a></p>
````````````````````````````````

```````````````````````````````` example
[foo](not a link)

[foo]: /url1
.
<p><a href="/url1">foo</a>(not a link)</p>
````````````````````````````````

In the following case `[bar][baz]` is parsed as a reference,
`[foo]` as normal text:

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url
.
<p>[foo]<a href="/url">bar</a></p>
````````````````````````````````


Here, though, `[foo][bar]` is parsed as a reference, since
`[bar]` is defined:

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url1
[bar]: /url2
.
<p><a href="/url2">foo</a><a href="/url1">baz</a></p>
````````````````````````````````


Here `[foo]` is not parsed as a shortcut reference, because it
is followed by a link label (even though `[bar]` is not defined):

```````````````````````````````` example
[foo][bar][baz]

[baz]: /url1
[foo]: /url2
.
<p>[foo]<a href="/url1">bar</a></p>
````````````````````````````````



## Images

Syntax for images is like the syntax for links, with one
difference.
Instead of [link text], we have an 8134 [image description](@). The rules for this are the 8135 same as for [link text], except that (a) an 8136 image description starts with `![` rather than `[`, and 8137 (b) an image description may contain links. 8138 An image description has inline elements 8139 as its contents. When an image is rendered to HTML, 8140 this is standardly used as the image's `alt` attribute. 8141 8142 ```````````````````````````````` example 8143 ![foo](/url "title") 8144 . 8145 <p><img src="/url" alt="foo" title="title" /></p> 8146 ```````````````````````````````` 8147 8148 8149 ```````````````````````````````` example 8150 ![foo *bar*] 8151 8152 [foo *bar*]: train.jpg "train & tracks" 8153 . 8154 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8155 ```````````````````````````````` 8156 8157 8158 ```````````````````````````````` example 8159 ![foo ![bar](/url)](/url2) 8160 . 8161 <p><img src="/url2" alt="foo bar" /></p> 8162 ```````````````````````````````` 8163 8164 8165 ```````````````````````````````` example 8166 ![foo [bar](/url)](/url2) 8167 . 8168 <p><img src="/url2" alt="foo bar" /></p> 8169 ```````````````````````````````` 8170 8171 8172 Though this spec is concerned with parsing, not rendering, it is 8173 recommended that in rendering to HTML, only the plain string content 8174 of the [image description] be used. Note that in 8175 the above example, the alt attribute's value is `foo bar`, not `foo 8176 [bar](/url)` or `foo <a href="/url">bar</a>`. Only the plain string 8177 content is rendered, without formatting. 8178 8179 ```````````````````````````````` example 8180 ![foo *bar*][] 8181 8182 [foo *bar*]: train.jpg "train & tracks" 8183 . 8184 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8185 ```````````````````````````````` 8186 8187 8188 ```````````````````````````````` example 8189 ![foo *bar*][foobar] 8190 8191 [FOOBAR]: train.jpg "train & tracks" 8192 . 
8193 <p><img src="train.jpg" alt="foo bar" title="train & tracks" /></p> 8194 ```````````````````````````````` 8195 8196 8197 ```````````````````````````````` example 8198 ![foo](train.jpg) 8199 . 8200 <p><img src="train.jpg" alt="foo" /></p> 8201 ```````````````````````````````` 8202 8203 8204 ```````````````````````````````` example 8205 My ![foo bar](/path/to/train.jpg "title" ) 8206 . 8207 <p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p> 8208 ```````````````````````````````` 8209 8210 8211 ```````````````````````````````` example 8212 ![foo](<url>) 8213 . 8214 <p><img src="url" alt="foo" /></p> 8215 ```````````````````````````````` 8216 8217 8218 ```````````````````````````````` example 8219 ![](/url) 8220 . 8221 <p><img src="/url" alt="" /></p> 8222 ```````````````````````````````` 8223 8224 8225 Reference-style: 8226 8227 ```````````````````````````````` example 8228 ![foo][bar] 8229 8230 [bar]: /url 8231 . 8232 <p><img src="/url" alt="foo" /></p> 8233 ```````````````````````````````` 8234 8235 8236 ```````````````````````````````` example 8237 ![foo][bar] 8238 8239 [BAR]: /url 8240 . 8241 <p><img src="/url" alt="foo" /></p> 8242 ```````````````````````````````` 8243 8244 8245 Collapsed: 8246 8247 ```````````````````````````````` example 8248 ![foo][] 8249 8250 [foo]: /url "title" 8251 . 8252 <p><img src="/url" alt="foo" title="title" /></p> 8253 ```````````````````````````````` 8254 8255 8256 ```````````````````````````````` example 8257 ![*foo* bar][] 8258 8259 [*foo* bar]: /url "title" 8260 . 8261 <p><img src="/url" alt="foo bar" title="title" /></p> 8262 ```````````````````````````````` 8263 8264 8265 The labels are case-insensitive: 8266 8267 ```````````````````````````````` example 8268 ![Foo][] 8269 8270 [foo]: /url "title" 8271 . 
8272 <p><img src="/url" alt="Foo" title="title" /></p> 8273 ```````````````````````````````` 8274 8275 8276 As with reference links, [whitespace] is not allowed 8277 between the two sets of brackets: 8278 8279 ```````````````````````````````` example 8280 ![foo] 8281 [] 8282 8283 [foo]: /url "title" 8284 . 8285 <p><img src="/url" alt="foo" title="title" /> 8286 []</p> 8287 ```````````````````````````````` 8288 8289 8290 Shortcut: 8291 8292 ```````````````````````````````` example 8293 ![foo] 8294 8295 [foo]: /url "title" 8296 . 8297 <p><img src="/url" alt="foo" title="title" /></p> 8298 ```````````````````````````````` 8299 8300 8301 ```````````````````````````````` example 8302 ![*foo* bar] 8303 8304 [*foo* bar]: /url "title" 8305 . 8306 <p><img src="/url" alt="foo bar" title="title" /></p> 8307 ```````````````````````````````` 8308 8309 8310 Note that link labels cannot contain unescaped brackets: 8311 8312 ```````````````````````````````` example 8313 ![[foo]] 8314 8315 [[foo]]: /url "title" 8316 . 8317 <p>![[foo]]</p> 8318 <p>[[foo]]: /url "title"</p> 8319 ```````````````````````````````` 8320 8321 8322 The link labels are case-insensitive: 8323 8324 ```````````````````````````````` example 8325 ![Foo] 8326 8327 [foo]: /url "title" 8328 . 8329 <p><img src="/url" alt="Foo" title="title" /></p> 8330 ```````````````````````````````` 8331 8332 8333 If you just want a literal `!` followed by bracketed text, you can 8334 backslash-escape the opening `[`: 8335 8336 ```````````````````````````````` example 8337 !\[foo] 8338 8339 [foo]: /url "title" 8340 . 8341 <p>![foo]</p> 8342 ```````````````````````````````` 8343 8344 8345 If you want a link after a literal `!`, backslash-escape the 8346 `!`: 8347 8348 ```````````````````````````````` example 8349 \![foo] 8350 8351 [foo]: /url "title" 8352 . 
8353 <p>!<a href="/url" title="title">foo</a></p> 8354 ```````````````````````````````` 8355 8356 8357 ## Autolinks 8358 8359 [Autolink](@)s are absolute URIs and email addresses inside 8360 `<` and `>`. They are parsed as links, with the URL or email address 8361 as the link label. 8362 8363 A [URI autolink](@) consists of `<`, followed by an 8364 [absolute URI] not containing `<`, followed by `>`. It is parsed as 8365 a link to the URI, with the URI as the link's label. 8366 8367 An [absolute URI](@), 8368 for these purposes, consists of a [scheme] followed by a colon (`:`) 8369 followed by zero or more characters other than ASCII 8370 [whitespace] and control characters, `<`, and `>`. If 8371 the URI includes these characters, they must be percent-encoded 8372 (e.g. `%20` for a space). 8373 8374 For purposes of this spec, a [scheme](@) is any sequence 8375 of 2--32 characters beginning with an ASCII letter and followed 8376 by any combination of ASCII letters, digits, or the symbols plus 8377 ("+"), period ("."), or hyphen ("-"). 8378 8379 Here are some valid autolinks: 8380 8381 ```````````````````````````````` example 8382 <http://foo.bar.baz> 8383 . 8384 <p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p> 8385 ```````````````````````````````` 8386 8387 8388 ```````````````````````````````` example 8389 <http://foo.bar.baz/test?q=hello&id=22&boolean> 8390 . 8391 <p><a href="http://foo.bar.baz/test?q=hello&id=22&boolean">http://foo.bar.baz/test?q=hello&id=22&boolean</a></p> 8392 ```````````````````````````````` 8393 8394 8395 ```````````````````````````````` example 8396 <irc://foo.bar:2233/baz> 8397 . 8398 <p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p> 8399 ```````````````````````````````` 8400 8401 8402 Uppercase is also fine: 8403 8404 ```````````````````````````````` example 8405 <MAILTO:FOO@BAR.BAZ> 8406 . 
<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
````````````````````````````````


Note that many strings that count as [absolute URIs] for
purposes of this spec are not valid URIs, because their
schemes are not registered or because of other problems
with their syntax:

```````````````````````````````` example
<a+b+c:d>
.
<p><a href="a+b+c:d">a+b+c:d</a></p>
````````````````````````````````


```````````````````````````````` example
<made-up-scheme://foo,bar>
.
<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
````````````````````````````````


```````````````````````````````` example
<http://../>
.
<p><a href="http://../">http://../</a></p>
````````````````````````````````


```````````````````````````````` example
<localhost:5001/foo>
.
<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
````````````````````````````````


Spaces are not allowed in autolinks:

```````````````````````````````` example
<http://foo.bar/baz bim>
.
<p>&lt;http://foo.bar/baz bim&gt;</p>
````````````````````````````````


Backslash-escapes do not work inside autolinks:

```````````````````````````````` example
<http://example.com/\[\>
.
<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
````````````````````````````````


An [email autolink](@)
consists of `<`, followed by an [email address],
followed by `>`. The link's label is the email address,
and the URL is `mailto:` followed by the email address.
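
The autolink rules in this section can be approximated with two regular expressions. This is a sketch under stated assumptions, not the reference implementation. The names `SCHEME`, `URI_AUTOLINK`, `EMAIL`, `EMAIL_AUTOLINK`, and `parse_autolink` are hypothetical, the email pattern is the HTML5 regex quoted in this section with its anchors removed, and the destination is returned unescaped (a renderer would still percent-encode it for the `href` attribute).

```python
import re

# Scheme: 2-32 characters, an ASCII letter then letters, digits, "+", ".", "-".
SCHEME = r"[A-Za-z][A-Za-z0-9+.-]{1,31}"
# Absolute URI: scheme, colon, then anything except ASCII whitespace,
# control characters, "<" and ">".
URI_AUTOLINK = re.compile(r"<(" + SCHEME + r":[^\x00-\x20\x7f<>]*)>")
# Email address: the HTML5 regex quoted in this section, without its anchors.
EMAIL = (
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)
EMAIL_AUTOLINK = re.compile(r"<(" + EMAIL + r")>")

def parse_autolink(text):
    """Return (destination, label) if `text` is exactly one autolink, else None."""
    m = URI_AUTOLINK.fullmatch(text)
    if m:
        return m.group(1), m.group(1)
    m = EMAIL_AUTOLINK.fullmatch(text)
    if m:
        return "mailto:" + m.group(1), m.group(1)
    return None

uri_link = parse_autolink("<irc://foo.bar:2233/baz>")
email_link = parse_autolink("<foo@bar.example.com>")
with_space = parse_autolink("<http://foo.bar/baz bim>")
short_scheme = parse_autolink("<m:abc>")
```

Because backslash is not given any special meaning in either pattern, `<foo\+@bar.example.com>` fails the email pattern and is not an autolink, in line with the rule that backslash-escapes do not work inside autolinks.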

An [email address](@),
for these purposes, is anything that matches
the [non-normative regex from the HTML5
spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):

    /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
    (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

Examples of email autolinks:

```````````````````````````````` example
<foo@bar.example.com>
.
<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
````````````````````````````````


```````````````````````````````` example
<foo+special@Bar.baz-bar0.com>
.
<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
````````````````````````````````


Backslash-escapes do not work inside email autolinks:

```````````````````````````````` example
<foo\+@bar.example.com>
.
<p>&lt;foo+@bar.example.com&gt;</p>
````````````````````````````````


These are not autolinks:

```````````````````````````````` example
<>
.
<p>&lt;&gt;</p>
````````````````````````````````


```````````````````````````````` example
< http://foo.bar >
.
<p>&lt; http://foo.bar &gt;</p>
````````````````````````````````


```````````````````````````````` example
<m:abc>
.
<p>&lt;m:abc&gt;</p>
````````````````````````````````


```````````````````````````````` example
<foo.bar.baz>
.
<p>&lt;foo.bar.baz&gt;</p>
````````````````````````````````


```````````````````````````````` example
http://example.com
.
<p>http://example.com</p>
````````````````````````````````


```````````````````````````````` example
foo@bar.example.com
.
8540 <p>foo@bar.example.com</p> 8541 ```````````````````````````````` 8542 8543 8544 ## Raw HTML 8545 8546 Text between `<` and `>` that looks like an HTML tag is parsed as a 8547 raw HTML tag and will be rendered in HTML without escaping. 8548 Tag and attribute names are not limited to current HTML tags, 8549 so custom tags (and even, say, DocBook tags) may be used. 8550 8551 Here is the grammar for tags: 8552 8553 A [tag name](@) consists of an ASCII letter 8554 followed by zero or more ASCII letters, digits, or 8555 hyphens (`-`). 8556 8557 An [attribute](@) consists of [whitespace], 8558 an [attribute name], and an optional 8559 [attribute value specification]. 8560 8561 An [attribute name](@) 8562 consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII 8563 letters, digits, `_`, `.`, `:`, or `-`. (Note: This is the XML 8564 specification restricted to ASCII. HTML5 is laxer.) 8565 8566 An [attribute value specification](@) 8567 consists of optional [whitespace], 8568 a `=` character, optional [whitespace], and an [attribute 8569 value]. 8570 8571 An [attribute value](@) 8572 consists of an [unquoted attribute value], 8573 a [single-quoted attribute value], or a [double-quoted attribute value]. 8574 8575 An [unquoted attribute value](@) 8576 is a nonempty string of characters not 8577 including spaces, `"`, `'`, `=`, `<`, `>`, or `` ` ``. 8578 8579 A [single-quoted attribute value](@) 8580 consists of `'`, zero or more 8581 characters not including `'`, and a final `'`. 8582 8583 A [double-quoted attribute value](@) 8584 consists of `"`, zero or more 8585 characters not including `"`, and a final `"`. 8586 8587 An [open tag](@) consists of a `<` character, a [tag name], 8588 zero or more [attributes], optional [whitespace], an optional `/` 8589 character, and a `>` character. 8590 8591 A [closing tag](@) consists of the string `</`, a 8592 [tag name], optional [whitespace], and the character `>`. 
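
The open-tag and closing-tag rules above translate fairly directly into regular expressions built up from the smaller definitions. The following is a sketch, not the reference implementation. All names are hypothetical, and `\s` under `re.ASCII` is assumed to be an acceptable stand-in for [whitespace] (it is also used, slightly more broadly than "spaces", in the unquoted-value pattern).

```python
import re

TAG_NAME = r"[A-Za-z][A-Za-z0-9-]*"
ATTR_NAME = r"[A-Za-z_:][A-Za-z0-9_.:-]*"
UNQUOTED = r"""[^\s"'=<>`]+"""
SINGLE_QUOTED = r"'[^']*'"
DOUBLE_QUOTED = r'"[^"]*"'
ATTR_VALUE = "(?:" + UNQUOTED + "|" + SINGLE_QUOTED + "|" + DOUBLE_QUOTED + ")"
# Optional whitespace, "=", optional whitespace, then an attribute value.
ATTR_VALUE_SPEC = r"(?:\s*=\s*" + ATTR_VALUE + ")"
# Whitespace, an attribute name, and an optional value specification.
ATTRIBUTE = r"(?:\s+" + ATTR_NAME + ATTR_VALUE_SPEC + "?)"

OPEN_TAG = re.compile("<" + TAG_NAME + ATTRIBUTE + r"*\s*/?>", re.ASCII)
CLOSING_TAG = re.compile("</" + TAG_NAME + r"\s*>", re.ASCII)

def is_open_tag(text):
    """True if `text` is exactly one open tag under the rules above."""
    return OPEN_TAG.fullmatch(text) is not None

def is_closing_tag(text):
    """True if `text` is exactly one closing tag under the rules above."""
    return CLOSING_TAG.fullmatch(text) is not None

custom_tag_ok = is_open_tag('<responsive-image src="foo.jpg" />')
missing_space_ok = is_open_tag("<a href='bar'title=title>")
closing_with_attr_ok = is_closing_tag('</a href="foo">')
```

Requiring `\s+` at the start of every attribute is what rejects `<a href='bar'title=title>`: the second attribute is not preceded by whitespace, so the string is not an open tag.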
8593 8594 An [HTML comment](@) consists of `<!--` + *text* + `-->`, 8595 where *text* does not start with `>` or `->`, does not end with `-`, 8596 and does not contain `--`. (See the 8597 [HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).) 8598 8599 A [processing instruction](@) 8600 consists of the string `<?`, a string 8601 of characters not including the string `?>`, and the string 8602 `?>`. 8603 8604 A [declaration](@) consists of the 8605 string `<!`, a name consisting of one or more uppercase ASCII letters, 8606 [whitespace], a string of characters not including the 8607 character `>`, and the character `>`. 8608 8609 A [CDATA section](@) consists of 8610 the string `<![CDATA[`, a string of characters not including the string 8611 `]]>`, and the string `]]>`. 8612 8613 An [HTML tag](@) consists of an [open tag], a [closing tag], 8614 an [HTML comment], a [processing instruction], a [declaration], 8615 or a [CDATA section]. 8616 8617 Here are some simple open tags: 8618 8619 ```````````````````````````````` example 8620 <a><bab><c2c> 8621 . 8622 <p><a><bab><c2c></p> 8623 ```````````````````````````````` 8624 8625 8626 Empty elements: 8627 8628 ```````````````````````````````` example 8629 <a/><b2/> 8630 . 8631 <p><a/><b2/></p> 8632 ```````````````````````````````` 8633 8634 8635 [Whitespace] is allowed: 8636 8637 ```````````````````````````````` example 8638 <a /><b2 8639 data="foo" > 8640 . 8641 <p><a /><b2 8642 data="foo" ></p> 8643 ```````````````````````````````` 8644 8645 8646 With attributes: 8647 8648 ```````````````````````````````` example 8649 <a foo="bar" bam = 'baz <em>"</em>' 8650 _boolean zoop:33=zoop:33 /> 8651 . 8652 <p><a foo="bar" bam = 'baz <em>"</em>' 8653 _boolean zoop:33=zoop:33 /></p> 8654 ```````````````````````````````` 8655 8656 8657 Custom tag names can be used: 8658 8659 ```````````````````````````````` example 8660 Foo <responsive-image src="foo.jpg" /> 8661 . 
<p>Foo <responsive-image src="foo.jpg" /></p>
````````````````````````````````


Illegal tag names, not parsed as HTML:

```````````````````````````````` example
<33> <__>
.
<p>&lt;33&gt; &lt;__&gt;</p>
````````````````````````````````


Illegal attribute names:

```````````````````````````````` example
<a h*#ref="hi">
.
<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
````````````````````````````````


Illegal attribute values:

```````````````````````````````` example
<a href="hi'> <a href=hi'>
.
<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
````````````````````````````````


Illegal [whitespace]:

```````````````````````````````` example
< a><
foo><bar/ >
.
<p>&lt; a&gt;&lt;
foo&gt;&lt;bar/ &gt;</p>
````````````````````````````````


Missing [whitespace]:

```````````````````````````````` example
<a href='bar'title=title>
.
<p>&lt;a href='bar'title=title&gt;</p>
````````````````````````````````


Closing tags:

```````````````````````````````` example
</a></foo >
.
<p></a></foo ></p>
````````````````````````````````


Illegal attributes in closing tag:

```````````````````````````````` example
</a href="foo">
.
<p>&lt;/a href=&quot;foo&quot;&gt;</p>
````````````````````````````````


Comments:

```````````````````````````````` example
foo <!-- this is a
comment - with hyphen -->
.
<p>foo <!-- this is a
comment - with hyphen --></p>
````````````````````````````````


```````````````````````````````` example
foo <!-- not a comment -- two hyphens -->
.
<p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
````````````````````````````````


Not comments:

```````````````````````````````` example
foo <!--> foo -->

foo <!-- foo--->
.
<p>foo &lt;!--&gt; foo --&gt;</p>
<p>foo &lt;!-- foo---&gt;</p>
````````````````````````````````


Processing instructions:

```````````````````````````````` example
foo <?php echo $a; ?>
.
<p>foo <?php echo $a; ?></p>
````````````````````````````````


Declarations:

```````````````````````````````` example
foo <!ELEMENT br EMPTY>
.
<p>foo <!ELEMENT br EMPTY></p>
````````````````````````````````


CDATA sections:

```````````````````````````````` example
foo <![CDATA[>&<]]>
.
<p>foo <![CDATA[>&<]]></p>
````````````````````````````````


Entity and numeric character references are preserved in HTML
attributes:

```````````````````````````````` example
foo <a href="&ouml;">
.
<p>foo <a href="&ouml;"></p>
````````````````````````````````


Backslash escapes do not work in HTML attributes:

```````````````````````````````` example
foo <a href="\*">
.
<p>foo <a href="\*"></p>
````````````````````````````````


```````````````````````````````` example
<a href="\"">
.
<p>&lt;a href=&quot;&quot;&quot;&gt;</p>
````````````````````````````````


## Hard line breaks

A line break (not in a code span or HTML tag) that is preceded
by two or more spaces and does not occur at the end of a block
is parsed as a [hard line break](@) (rendered
in HTML as a `<br />` tag):

```````````````````````````````` example
foo  
baz
.
8825 <p>foo<br /> 8826 baz</p> 8827 ```````````````````````````````` 8828 8829 8830 For a more visible alternative, a backslash before the 8831 [line ending] may be used instead of two spaces: 8832 8833 ```````````````````````````````` example 8834 foo\ 8835 baz 8836 . 8837 <p>foo<br /> 8838 baz</p> 8839 ```````````````````````````````` 8840 8841 8842 More than two spaces can be used: 8843 8844 ```````````````````````````````` example 8845 foo 8846 baz 8847 . 8848 <p>foo<br /> 8849 baz</p> 8850 ```````````````````````````````` 8851 8852 8853 Leading spaces at the beginning of the next line are ignored: 8854 8855 ```````````````````````````````` example 8856 foo 8857 bar 8858 . 8859 <p>foo<br /> 8860 bar</p> 8861 ```````````````````````````````` 8862 8863 8864 ```````````````````````````````` example 8865 foo\ 8866 bar 8867 . 8868 <p>foo<br /> 8869 bar</p> 8870 ```````````````````````````````` 8871 8872 8873 Line breaks can occur inside emphasis, links, and other constructs 8874 that allow inline content: 8875 8876 ```````````````````````````````` example 8877 *foo 8878 bar* 8879 . 8880 <p><em>foo<br /> 8881 bar</em></p> 8882 ```````````````````````````````` 8883 8884 8885 ```````````````````````````````` example 8886 *foo\ 8887 bar* 8888 . 8889 <p><em>foo<br /> 8890 bar</em></p> 8891 ```````````````````````````````` 8892 8893 8894 Line breaks do not occur inside code spans 8895 8896 ```````````````````````````````` example 8897 `code 8898 span` 8899 . 8900 <p><code>code span</code></p> 8901 ```````````````````````````````` 8902 8903 8904 ```````````````````````````````` example 8905 `code\ 8906 span` 8907 . 8908 <p><code>code\ span</code></p> 8909 ```````````````````````````````` 8910 8911 8912 or HTML tags: 8913 8914 ```````````````````````````````` example 8915 <a href="foo 8916 bar"> 8917 . 8918 <p><a href="foo 8919 bar"></p> 8920 ```````````````````````````````` 8921 8922 8923 ```````````````````````````````` example 8924 <a href="foo\ 8925 bar"> 8926 . 
8927 <p><a href="foo\ 8928 bar"></p> 8929 ```````````````````````````````` 8930 8931 8932 Hard line breaks are for separating inline content within a block. 8933 Neither syntax for hard line breaks works at the end of a paragraph or 8934 other block element: 8935 8936 ```````````````````````````````` example 8937 foo\ 8938 . 8939 <p>foo\</p> 8940 ```````````````````````````````` 8941 8942 8943 ```````````````````````````````` example 8944 foo 8945 . 8946 <p>foo</p> 8947 ```````````````````````````````` 8948 8949 8950 ```````````````````````````````` example 8951 ### foo\ 8952 . 8953 <h3>foo\</h3> 8954 ```````````````````````````````` 8955 8956 8957 ```````````````````````````````` example 8958 ### foo 8959 . 8960 <h3>foo</h3> 8961 ```````````````````````````````` 8962 8963 8964 ## Soft line breaks 8965 8966 A regular line break (not in a code span or HTML tag) that is not 8967 preceded by two or more spaces or a backslash is parsed as a 8968 [softbreak](@). (A softbreak may be rendered in HTML either as a 8969 [line ending] or as a space. The result will be the same in 8970 browsers. In the examples here, a [line ending] will be used.) 8971 8972 ```````````````````````````````` example 8973 foo 8974 baz 8975 . 8976 <p>foo 8977 baz</p> 8978 ```````````````````````````````` 8979 8980 8981 Spaces at the end of the line and beginning of the next line are 8982 removed: 8983 8984 ```````````````````````````````` example 8985 foo 8986 baz 8987 . 8988 <p>foo 8989 baz</p> 8990 ```````````````````````````````` 8991 8992 8993 A conforming parser may render a soft line break in HTML either as a 8994 line break or as a space. 8995 8996 A renderer may also provide an option to render soft line breaks 8997 as hard line breaks. 8998 8999 ## Textual content 9000 9001 Any characters not given an interpretation by the above rules will 9002 be parsed as plain textual content. 9003 9004 ```````````````````````````````` example 9005 hello $.;'there 9006 . 
9007 <p>hello $.;'there</p> 9008 ```````````````````````````````` 9009 9010 9011 ```````````````````````````````` example 9012 Foo χρῆν 9013 . 9014 <p>Foo χρῆν</p> 9015 ```````````````````````````````` 9016 9017 9018 Internal spaces are preserved verbatim: 9019 9020 ```````````````````````````````` example 9021 Multiple spaces 9022 . 9023 <p>Multiple spaces</p> 9024 ```````````````````````````````` 9025 9026 9027 <!-- END TESTS --> 9028 9029 # Appendix: A parsing strategy 9030 9031 In this appendix we describe some features of the parsing strategy 9032 used in the CommonMark reference implementations. 9033 9034 ## Overview 9035 9036 Parsing has two phases: 9037 9038 1. In the first phase, lines of input are consumed and the block 9039 structure of the document---its division into paragraphs, block quotes, 9040 list items, and so on---is constructed. Text is assigned to these 9041 blocks but not parsed. Link reference definitions are parsed and a 9042 map of links is constructed. 9043 9044 2. In the second phase, the raw text contents of paragraphs and headings 9045 are parsed into sequences of Markdown inline elements (strings, 9046 code spans, links, emphasis, and so on), using the map of link 9047 references constructed in phase 1. 9048 9049 At each point in processing, the document is represented as a tree of 9050 **blocks**. The root of the tree is a `document` block. The `document` 9051 may have any number of other blocks as **children**. These children 9052 may, in turn, have other blocks as children. The last child of a block 9053 is normally considered **open**, meaning that subsequent lines of input 9054 can alter its contents. (Blocks that are not open are **closed**.) 9055 Here, for example, is a possible document tree, with the open blocks 9056 marked by arrows: 9057 9058 ``` tree 9059 -> document 9060 -> block_quote 9061 paragraph 9062 "Lorem ipsum dolor\nsit amet." 
9063 -> list (type=bullet tight=true bullet_char=-) 9064 list_item 9065 paragraph 9066 "Qui *quodsi iracundia*" 9067 -> list_item 9068 -> paragraph 9069 "aliquando id" 9070 ``` 9071 9072 ## Phase 1: block structure 9073 9074 Each line that is processed has an effect on this tree. The line is 9075 analyzed and, depending on its contents, the document may be altered 9076 in one or more of the following ways: 9077 9078 1. One or more open blocks may be closed. 9079 2. One or more new blocks may be created as children of the 9080 last open block. 9081 3. Text may be added to the last (deepest) open block remaining 9082 on the tree. 9083 9084 Once a line has been incorporated into the tree in this way, 9085 it can be discarded, so input can be read in a stream. 9086 9087 For each line, we follow this procedure: 9088 9089 1. First we iterate through the open blocks, starting with the 9090 root document, and descending through last children down to the last 9091 open block. Each block imposes a condition that the line must satisfy 9092 if the block is to remain open. For example, a block quote requires a 9093 `>` character. A paragraph requires a non-blank line. 9094 In this phase we may match all or just some of the open 9095 blocks. But we cannot close unmatched blocks yet, because we may have a 9096 [lazy continuation line]. 9097 9098 2. Next, after consuming the continuation markers for existing 9099 blocks, we look for new block starts (e.g. `>` for a block quote). 9100 If we encounter a new block start, we close any blocks unmatched 9101 in step 1 before creating the new block as a child of the last 9102 matched block. 9103 9104 3. Finally, we look at the remainder of the line (after block 9105 markers like `>`, list markers, and indentation have been consumed). 9106 This is text that can be incorporated into the last open 9107 block (a paragraph, code block, heading, or raw HTML). 
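
The three-step procedure above can be sketched for a deliberately tiny subset of block types. This is an illustration only, not the reference implementation: it knows only `document`, `block_quote`, and `paragraph` blocks, ignores lists, headings, code blocks, and link reference definitions, and the names `Block`, `open_chain`, `strip_quote_marker`, and `add_line` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    kind: str                                    # "document", "block_quote" or "paragraph"
    children: list = field(default_factory=list)
    lines: list = field(default_factory=list)    # raw text (paragraphs only)
    is_open: bool = True

def open_chain(doc):
    """Open blocks from the root down through last children."""
    chain = [doc]
    while chain[-1].children and chain[-1].children[-1].is_open:
        chain.append(chain[-1].children[-1])
    return chain

def strip_quote_marker(rest):
    """Consume `>` (after up to 3 spaces) and one optional space, else None."""
    s = rest.lstrip(" ")
    if len(rest) - len(s) <= 3 and s.startswith(">"):
        return s[2:] if s.startswith("> ") else s[1:]
    return None

def add_line(doc, line):
    chain, rest = open_chain(doc), line
    # Step 1: walk the open blocks and see which ones this line continues.
    matched = 1                                  # the document always matches
    for block in chain[1:]:
        if block.kind == "block_quote":
            stripped = strip_quote_marker(rest)
            if stripped is None:
                break
            rest = stripped
        elif not rest.strip():                   # a paragraph needs a non-blank line
            break
        matched += 1
    # Step 2: look for new block starts in what is left of the line.
    new_quotes = 0
    while (stripped := strip_quote_marker(rest)) is not None:
        new_quotes, rest = new_quotes + 1, stripped
    text = rest.strip()
    # Paragraph continuation (possibly lazy): no new block, some text,
    # and the deepest open block is a paragraph.
    if not new_quotes and text and chain[-1].kind == "paragraph":
        chain[-1].lines.append(text)
        return
    # Otherwise close unmatched blocks (and any paragraph being interrupted).
    keep = [b for b in chain[:matched] if b.kind != "paragraph"]
    for block in chain[len(keep):]:
        block.is_open = False
    container = keep[-1]
    for _ in range(new_quotes):
        container.children.append(Block("block_quote"))
        container = container.children[-1]
    # Step 3: the remaining text goes into a new paragraph.
    if text:
        container.children.append(Block("paragraph", lines=[text]))

doc = Block("document")
for source_line in ["> Lorem ipsum dolor", "sit amet.", "", "plain"]:
    add_line(doc, source_line)
```

Note that unmatched blocks are closed only after step 2 has found no reason to treat the line as a continuation. That delay is what lets `sit amet.` join the paragraph inside the block quote even though the line has no `>` marker.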

Setext headings are formed when we see a line of a paragraph
that is a [setext heading underline].

Reference link definitions are detected when a paragraph is closed;
the accumulated text lines are parsed to see if they begin with
one or more reference link definitions. Any remainder becomes a
normal paragraph.

We can see how this works by considering how the tree above is
generated by four lines of Markdown:

``` markdown
> Lorem ipsum dolor
sit amet.
> - Qui *quodsi iracundia*
> - aliquando id
```

At the outset, our document model is just

``` tree
-> document
```

The first line of our text,

``` markdown
> Lorem ipsum dolor
```

causes a `block_quote` block to be created as a child of our
open `document` block, and a `paragraph` block as a child of
the `block_quote`. Then the text is added to the last open
block, the `paragraph`:

``` tree
-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor"
```

The next line,

``` markdown
sit amet.
```

is a "lazy continuation" of the open `paragraph`, so it gets added
to the paragraph's text:

``` tree
-> document
  -> block_quote
    -> paragraph
         "Lorem ipsum dolor\nsit amet."
```

The third line,

``` markdown
> - Qui *quodsi iracundia*
```

causes the `paragraph` block to be closed, and a new `list` block
opened as a child of the `block_quote`. A `list_item` is also
added as a child of the `list`, and a `paragraph` as a child of
the `list_item`. The text is then added to the new `paragraph`:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
      -> list_item
        -> paragraph
             "Qui *quodsi iracundia*"
```

The fourth line,

``` markdown
> - aliquando id
```

causes the `list_item` (and its child the `paragraph`) to be closed,
and a new `list_item` opened up as a child of the `list`. A `paragraph`
is added as a child of the new `list_item`, to contain the text.
We thus obtain the final tree:

``` tree
-> document
  -> block_quote
       paragraph
         "Lorem ipsum dolor\nsit amet."
    -> list (type=bullet tight=true bullet_char=-)
         list_item
           paragraph
             "Qui *quodsi iracundia*"
      -> list_item
        -> paragraph
             "aliquando id"
```

## Phase 2: inline structure

Once all of the input has been parsed, all open blocks are closed.

We then "walk the tree," visiting every node, and parse raw
string contents of paragraphs and headings as inlines. At this
point we have seen all the link reference definitions, so we can
resolve reference links as we go.

``` tree
document
  block_quote
    paragraph
      str "Lorem ipsum dolor"
      softbreak
      str "sit amet."
    list (type=bullet tight=true bullet_char=-)
      list_item
        paragraph
          str "Qui "
          emph
            str "quodsi iracundia"
      list_item
        paragraph
          str "aliquando id"
```

Notice how the [line ending] in the first paragraph has
been parsed as a `softbreak`, and the asterisks in the first list item
have become an `emph`.

### An algorithm for parsing nested emphasis and links

By far the trickiest part of inline parsing is handling emphasis,
strong emphasis, links, and images. This is done using the following
algorithm.

When we're parsing inlines and we hit either

- a run of `*` or `_` characters, or
- a `[` or `![`

we insert a text node with these symbols as its literal content, and we
add a pointer to this text node to the [delimiter stack](@).

The [delimiter stack] is a doubly linked list. Each
element contains a pointer to a text node, plus information about

- the type of delimiter (`[`, `![`, `*`, `_`),
- the number of delimiters,
- whether the delimiter is "active" (all are active to start), and
- whether the delimiter is a potential opener, a potential closer,
  or both (which depends on what sort of characters precede
  and follow the delimiters).

When we hit a `]` character, we call the *look for link or image*
procedure (see below).

When we hit the end of the input, we call the *process emphasis*
procedure (see below), with `stack_bottom` = NULL.

#### *look for link or image*

Starting at the top of the delimiter stack, we look backwards
through the stack for an opening `[` or `![` delimiter.

- If we don't find one, we return a literal text node `]`.

- If we do find one, but it's not *active*, we remove the inactive
  delimiter from the stack, and return a literal text node `]`.

- If we find one and it's active, then we parse ahead to see if
  we have an inline link/image, reference link/image, collapsed reference
  link/image, or shortcut reference link/image.

  + If we don't, then we remove the opening delimiter from the
    delimiter stack and return a literal text node `]`.

  + If we do, then

    * We return a link or image node whose children are the inlines
      after the text node pointed to by the opening delimiter.

    * We run *process emphasis* on these inlines, with the `[` opener
      as `stack_bottom`.

    * We remove the opening delimiter.

    * If we have a link (and not an image), we also set all
      `[` delimiters before the opening delimiter to *inactive*. (This
      will prevent us from getting links within links.)

#### *process emphasis*

Parameter `stack_bottom` sets a lower bound to how far we
descend in the [delimiter stack]. If it is NULL, we can
go all the way to the bottom. Otherwise, we stop before
visiting `stack_bottom`.

Let `current_position` point to the element on the [delimiter stack]
just above `stack_bottom` (or the first element if `stack_bottom`
is NULL).

We keep track of the `openers_bottom` for each delimiter
type (`*`, `_`). Initialize this to `stack_bottom`.

Then we repeat the following until we run out of potential
closers:

- Move `current_position` forward in the delimiter stack (if needed)
  until we find the first potential closer with delimiter `*` or `_`.
  (This will be the potential closer closest
  to the beginning of the input -- the first one in parse order.)

- Now, look back in the stack (staying above `stack_bottom` and
  the `openers_bottom` for this delimiter type) for the
  first matching potential opener ("matching" means same delimiter).

- If one is found:

  + Figure out whether we have emphasis or strong emphasis:
    if both closer and opener spans have length >= 2, we have
    strong, otherwise regular.

  + Insert an emph or strong emph node accordingly, after
    the text node corresponding to the opener.

  + Remove any delimiters between the opener and closer from
    the delimiter stack.

  + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
    from the opening and closing text nodes. If they become empty
    as a result, remove them and remove the corresponding element
    of the delimiter stack. If the closing node is removed, reset
    `current_position` to the next element in the stack.

- If none is found:

  + Set `openers_bottom` to the element before `current_position`.
    (We know that there are no openers for this kind of closer up to and
    including this point, so this puts a lower bound on future searches.)

  + If the closer at `current_position` is not a potential opener,
    remove it from the delimiter stack (since we know it can't
    be a closer either).

  + Advance `current_position` to the next element in the stack.

After we're done, we remove all delimiters above `stack_bottom` from the
delimiter stack.
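
To make the *process emphasis* steps concrete, here is a minimal
Python sketch. It is an illustration under simplifying assumptions, not
the reference implementation: the names `Delim`, `alive`, and
`process_emphasis` are hypothetical, `stack_bottom` is always taken to
be NULL, the delimiter stack is a plain list with "removed" entries
flagged rather than unlinked, inline nodes are plain strings or
`(kind, children)` tuples, and the caller is assumed to have already
decided which runs are potential openers and closers.

```python
from dataclasses import dataclass


@dataclass(eq=False)      # compare by identity, so list.index finds this exact run
class Delim:
    char: str             # "*" or "_"
    count: int            # delimiters still left in the run
    can_open: bool        # potential opener (assumed precomputed by the caller)
    can_close: bool       # potential closer (assumed precomputed by the caller)
    alive: bool = True    # False once "removed" from the delimiter stack


def literal(node):
    """Turn a leftover delimiter run back into plain text."""
    return node.char * node.count if isinstance(node, Delim) else node


def process_emphasis(nodes):
    """Resolve delimiter runs in `nodes`; the Delim objects are mutated."""
    nodes = list(nodes)
    stack = [n for n in nodes if isinstance(n, Delim)]
    openers_bottom = {"*": 0, "_": 0}   # lowest stack index still worth searching
    pos = 0                             # current_position
    while pos < len(stack):
        closer = stack[pos]
        if not (closer.alive and closer.can_close):
            pos += 1
            continue
        # Look back for the nearest live opener using the same character.
        opener_idx = None
        for i in range(pos - 1, openers_bottom[closer.char] - 1, -1):
            cand = stack[i]
            if cand.alive and cand.can_open and cand.char == closer.char:
                opener_idx = i
                break
        if opener_idx is None:
            # No opener at or below this point for this delimiter type.
            openers_bottom[closer.char] = pos
            if not closer.can_open:
                closer.alive = False
            pos += 1
            continue
        opener = stack[opener_idx]
        used = 2 if opener.count >= 2 and closer.count >= 2 else 1
        kind = "strong" if used == 2 else "emph"
        # Delimiters strictly between opener and closer leave the stack.
        for between in stack[opener_idx + 1:pos]:
            between.alive = False
        # Wrap the inlines between the two runs in the new node.
        a, b = nodes.index(opener), nodes.index(closer)
        nodes[a + 1:b] = [(kind, [literal(n) for n in nodes[a + 1:b]])]
        opener.count -= used
        closer.count -= used
        if opener.count == 0:
            opener.alive = False
            nodes.remove(opener)
        if closer.count == 0:
            closer.alive = False
            nodes.remove(closer)
            pos += 1      # otherwise the same closer is tried again
    # Whatever delimiters remain are emitted as literal text.
    return [literal(n) for n in nodes]
```

Applied to the first list item of the example above, the input
`["Qui ", Delim("*", 1, True, False), "quodsi iracundia", Delim("*", 1, False, True)]`
yields `["Qui ", ("emph", ["quodsi iracundia"])]`, which corresponds to
the `str` and `emph` nodes shown in the Phase 2 tree. Flagging entries
as not `alive` instead of unlinking them keeps list indices stable, so
`openers_bottom` can be stored as a simple index in this sketch.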