Compare commits

...

333 commits

Author SHA1 Message Date
Yusuke Inuzuka
04410ff159
Merge pull request #487 from n-peugnet/patch-1
Fix GitHub actions badge URL
2025-02-19 03:38:00 +09:00
Yusuke Inuzuka
f7b2d24c09
Merge pull request #489 from lyricat/master
chore: update `goldmark-enclave` repo URL and information
2025-02-19 03:37:26 +09:00
Lyric Wai
ba49c5c69d
Merge pull request #1 from lyricat/lyricat-patch-enclave
Update README.md
2025-02-18 23:34:09 +09:00
Lyric Wai
c05fb087a4
Update README.md
Update `goldmark-enclave` url and information
2025-02-18 23:33:14 +09:00
Nicolas Peugnet
b39daae79e
Fix GitHub actions badge URL 2025-02-02 15:58:36 +01:00
yuin
d9c03f07f0 Deprecate Node.Text
Node.Text was intended to get a text value from some inline nodes.
A 'text value' of a Text node is clear.

But

- BaseNode had a default implementation of Node.Text
- Lacks of GoDoc description that Node.Text is valid only for
  some inline nodes

So, some users are using Node.Text for BlockNodes.

A 'text value' for a BlockNode is not clear.

e.g. : Text value of a ListNode

- It should be contains list markers?
- What do characters concatinate List items with? newlines? spaces?
- If it contains codeblocks, codeblocks should be fenced or indented?

Now we would like to avoid such ambiguous method.
2024-10-16 20:47:35 +09:00
yuin
65dcf6cd0a Add warning to Node.Text GoDoc 2024-10-15 19:22:00 +09:00
yuin
ad1565131a Fix #470 2024-10-15 19:19:41 +09:00
yuin
bc993b4f59 Fix testcases 2024-10-12 23:12:47 +09:00
yuin
41273a4d07 Fix EOF rendering 2024-10-12 23:05:37 +09:00
Yusuke Inuzuka
d80ac9397c
Merge pull request #462 from Andrew-Morozko/table_fix
Fix panic in table parser
2024-10-12 22:46:35 +09:00
yuin
15000ac6a1 Fix lint errors 2024-10-12 22:42:46 +09:00
Yusuke Inuzuka
14d91f957f
Merge pull request #432 from dr2chase/master
update unsafe code to user newer-faster-better idioms
2024-10-12 22:42:18 +09:00
yuin
3847ca20c6 lazy initialize html5entities 2024-10-12 22:37:57 +09:00
yuin
697e44ce88 Fix #464, Fix #465 2024-10-12 22:18:27 +09:00
yuin
fa88006eee Fix #466 2024-10-12 19:17:05 +09:00
Yusuke Inuzuka
fe34ea5d96
Merge pull request #460 from cbednarski/b-ast-block-text
Add Text() retrieval for BaseBlock types
2024-10-12 18:24:34 +09:00
Andrew Morozko
fd14edc9bc
Fix panic in table parser 2024-08-15 23:38:43 +04:00
Chris Bednarski
e367755421 Added test for BaseBlock.Text retrieval 2024-07-25 23:40:45 -07:00
Chris Bednarski
e44645afbb Implement Text interface for BaseBlock
This implementation was missing, making it impossible to retrieve Text
from block types, such as CodeBlock and FencedCodeBlock, via the ast
interface.
2024-07-25 22:37:54 -07:00
yuin
15ade8aace Fixes #457 2024-06-25 23:29:29 +09:00
yuin
dc32f35808 Fix lint errors 2024-06-23 22:09:11 +09:00
yuin
25bdeb0fee Fixes #456 2024-06-23 21:46:17 +09:00
yuin
a590622b15 Fixes #455 2024-06-14 22:05:13 +09:00
Yusuke Inuzuka
fde4948b4d
Merge pull request #455 from camdencheek/support-single-tilde-strikethrough
GitHub flavored markdown: support single-tilde strikethrough
2024-06-14 21:26:16 +09:00
Camden Cheek
9c09ae0019
support single-tilde strikethrough 2024-06-11 11:16:22 -06:00
Yusuke Inuzuka
c15e394c27
Merge pull request #448 from movsb/fix-attribute-string
make RenderAttributes() accept both []byte and string
2024-04-03 18:46:06 +09:00
movsb
e405d57be0 make SetAttributeString() accept both []byte and string 2024-04-02 20:00:37 +08:00
Yusuke Inuzuka
ce6424aa0e
Merge pull request #446 from mr-chelyshkin/goldmark-tgmd
add link to goldmark-tgmd renderer
2024-03-24 21:08:58 +09:00
mr-chelyshkin
09afa2feba add link to goldmark-tgmd renderer 2024-03-23 08:12:08 +02:00
Yusuke Inuzuka
4f3074451e
Merge pull request #443 from philipparndt/patch-1
docu: update example as it will not build
2024-02-29 14:35:50 +09:00
Philipp Arndt
4675c66d3d
docu: update example as it will not build 2024-02-19 07:56:40 +01:00
yuin
848dc66530 Add playground link 2024-02-03 20:12:14 +09:00
yuin
b8d6d3a9b7 Bump up CommonMark Spec to 0.31.2 2024-02-02 21:13:09 +09:00
yuin
90c46e0829 Remove io/ioutil s 2024-01-23 22:45:14 +09:00
yuin
4bade05173 Drop Go1.18 support 2024-01-23 22:41:12 +09:00
David Chase
2b845f2615 update unsafe code to user newer-faster-better idioms 2023-11-28 11:16:10 -05:00
Yusuke Inuzuka
e3d8ed9725
Merge pull request #429 from movsb/extension-wiki-table
Add extension: wiki-table
2023-11-21 19:16:40 +09:00
movsb
697cd509b1 Add extension: wiki-table 2023-11-21 08:34:34 +08:00
Yusuke Inuzuka
ff3285aa2a
Merge pull request #427 from lyricat/patch-1
README/extensions: Add goldmark-enclave
2023-11-20 18:09:01 +09:00
Lyric Wai
c2167685c1
Add an extension, goldmark-enclave 2023-11-17 19:47:10 +09:00
yuin
39a50c623e Add goldmark-dynamic 2023-11-03 21:27:34 +09:00
yuin
9c9003363f Simplify EastAsianLineBreaks 2023-10-28 17:57:55 +09:00
Yusuke Inuzuka
a89ad04c49
Merge pull request #411 from henry0312/update_cond_east_asian_line_breaks
Define line break styles for east asian characters as options
2023-10-28 17:27:21 +09:00
OMOTO Tsukasa
6b3067e7e7 Implements CSS3Draft 2023-10-24 21:54:35 +09:00
yuin
6442ae1259 Fix #416 2023-10-14 18:02:09 +09:00
Yusuke Inuzuka
68e53654f2
Merge pull request #419 from roife/master
Fix #418
2023-10-08 21:43:29 +09:00
roife
04d4dd50ab Fix #418 2023-09-29 02:41:10 +08:00
OMOTO Tsukasa
8c6830d73b fix errors of lints 2023-09-24 15:07:17 +09:00
OMOTO Tsukasa
792af6819e Updat README.md 2023-09-24 14:25:34 +09:00
OMOTO Tsukasa
9d0b1b6bb8 Define EastAsianLineBreaksStyle to specify behavior of line breaking 2023-09-24 14:25:28 +09:00
OMOTO Tsukasa
dc2230c235 fix tests 2023-09-10 18:48:44 +09:00
OMOTO Tsukasa
2367b9ff46 add comments 2023-09-10 15:17:16 +09:00
OMOTO Tsukasa
6cbcfebb71 Add a WorksEvenWithOneSide option to EastAsianLineBreak 2023-09-10 15:08:57 +09:00
OMOTO Tsukasa
6ef9b10a3a Improve line breaking behavior for east asian characters
This commit aims to produce more natural line breaks in the rendered output.
2023-08-27 15:13:49 +09:00
yuin
d39ab8f93e Runs linters only if on the linux 2023-08-15 18:53:16 +09:00
yuin
9b02182dd0 Apply linters 2023-08-15 18:40:41 +09:00
Yusuke Inuzuka
ac56543632
Merge pull request #409 from henry0312/support_cjk_symbols_and_punctuation
Add support for CJK Symbols and Punctuation
2023-08-13 22:26:44 +09:00
OMOTO Tsukasa
2f1b40d881 Support CJK Symbols and Punctuation
This commit adds support of CJK Symbols and Punctuation to `func IsEastAsianWideRune`
2023-08-13 13:45:19 +09:00
yuin
254b9f8f77 Fix #403, Fix #406 2023-07-23 22:07:27 +09:00
yuin
31ccfc4039 update benchmark 2023-06-17 19:36:11 +09:00
Yusuke Inuzuka
b2df67847e
Merge pull request #353 from silvergasp/master
fuzz: Make goldmark compatible with OSS-Fuzz
2023-05-30 17:13:45 +09:00
Yusuke Inuzuka
4536092b45
Merge pull request #337 from piggynl/test-timeout-multiplier
extra_test.go: Add test timeout multiplier environment variable
2023-05-30 17:08:42 +09:00
yuin
2ce4086884 move to stale actions 2023-05-29 14:52:23 +09:00
Yusuke Inuzuka
d9067d2324
Merge pull request #391 from yilei/goldmark-figure
Adds goldmark-figure to the extension list.
2023-05-08 14:31:54 +09:00
Yilei Yang
69a0811de5 Shorten the description. 2023-04-25 19:48:48 -07:00
Yilei Yang
e46b1b5305 Adds goldmark-figure to the extension list. 2023-04-25 19:47:22 -07:00
Yusuke Inuzuka
8e2127faa4
Merge pull request #350 from ohanan/master
feat(reader): work for cjk in findSubMatchReader
2023-04-06 17:14:12 +09:00
Yusuke Inuzuka
c6f0e7e0c5
Merge pull request #380 from abhinav/frontmatter-ext
README/extensions: Add goldmark-frontmatter
2023-03-21 20:55:02 +09:00
Yusuke Inuzuka
d47eaf8c2d
Merge pull request #379 from tenkoh/update/readme
Update README.md - add extension
2023-03-21 20:54:48 +09:00
Abhinav Gupta
078e8aa5bb README/extensions: Add goldmark-frontmatter
Adds [goldmark-frontmatter](https://github.com/abhinav/goldmark-frontmatter)
to the extension list.
2023-03-16 13:51:36 -07:00
tenkoh
42f759fbe6
Update README.md - add extension
Add extension `goldmark-img64` to support embedding images as DataURL.
2023-03-13 12:49:02 +09:00
Yusuke Inuzuka
9d6f314b99
Merge pull request #363 from abhinav/patch-1
README/extensions: Add goldmark-anchor
2023-02-08 05:23:17 +09:00
Abhinav Gupta
3505399723
README/extensions: Add goldmark-anchor
Adds [goldmark-anchor](https://github.com/abhinav/goldmark-anchor) to the extension list.
2023-02-04 15:48:41 -08:00
yuin
82ff890073 Fix #361 2023-02-02 21:02:21 +09:00
Yusuke Inuzuka
550b0c7ca0
Merge pull request #358 from jmooring/issue-357
Add role to global HTML attributes
2023-01-30 22:58:31 +09:00
Joe Mooring
c446c414ef Add role to global HTML attributes
Closes #357
2023-01-27 11:07:07 -08:00
Nathaniel Brough
5e78751e90 fuzz: Make goldmark compatible with OSS-Fuzz
This set of changes will make goldmark compatible with the fuzzing
requirements for google/oss-fuzz, a continuous fuzzing service.
To ensure compatibility this change adds;
- A seperate fuzzing target that doesn't use the builtin fuzz corpus
- A tool to convert the builtin fuzz corpus to a zip file corpus
  that is compatible with oss-fuzz
2023-01-19 12:16:00 -08:00
lihuanan
023c1d9d21 feat(reader): work for cjk in findSubMatchReader
Change-Id: Id30a946c4e78e7cdaf0fa7af9300b98f754e5b80
2023-01-09 17:30:16 +08:00
Yusuke Inuzuka
60df4aadee
Merge pull request #344 from hjr265/patch-1
Adding two extensions to README.md
2022-12-26 03:30:49 +09:00
Mahmud Ridwan
a0962b569b
Update README.md 2022-12-25 13:17:07 +06:00
yuin
a87c5778f9 Fix #333 2022-11-12 20:13:03 +09:00
Piggy NL
b1ef69a870 Add test timeout multiplier environment variable 2022-11-11 17:55:35 +08:00
Yusuke Inuzuka
aaeb9851a6
Merge pull request #332 from stefanfritsch/feature-fences
Add link to goldmark-fences
2022-10-29 21:08:18 +09:00
stefanfritsch
34756f289f
Add link to goldmark-fences 2022-10-24 22:22:32 +02:00
yuin
c71a97b837 Fixed bug related newline code 2022-09-25 23:47:23 +09:00
yuin
ae42b9179f Fix bug that escaped space not working with Linkfy extension 2022-09-25 19:39:48 +09:00
yuin
1dd67d5750 Add CJK extension 2022-09-25 03:31:14 +09:00
Yusuke Inuzuka
95efaa1805
Merge pull request #324 from soypat/patch-1
Add goldmark-latex extension to README.md
2022-09-23 20:30:56 +09:00
yuin
a3630e3073 Fix #323 2022-09-19 20:39:17 +09:00
Patricio Whittingslow
3923ba0148
Add goldmark-latex extension to README.md 2022-09-04 18:54:21 -03:00
yuin
20df1691ad Fix #321 2022-09-02 20:08:28 +09:00
yuin
175c5ecd0c Drop go 1.17 support 2022-08-06 19:59:30 +09:00
Yusuke Inuzuka
d77f38c53d
Merge pull request #316 from vincentbernat/fix/svg-mime-type
Fix SVG mime type (image/svg+xml)
2022-07-30 05:12:45 +09:00
Yusuke Inuzuka
42509ee585
Merge pull request #314 from MarkRosemaker/common-sense-NewTableRow
NewTableRow: use given parameter
2022-07-30 05:12:33 +09:00
Vincent Bernat
db82c79f20 Fix SVG mime type (image/svg+xml)
In PR #298, I was using image/svg for the MIME type for SVG image.
This is in fact image/svg+xml. Sorry for that.
2022-07-22 21:26:18 +02:00
Mark Rosemaker
1efd8073c5 Update table.go 2022-07-13 19:38:47 +02:00
yuin
c0856327b3 Fix #313 2022-07-09 16:22:17 +09:00
Yusuke Inuzuka
8c219a7562
Merge pull request #304 from KevSlashNull/patch-1
Fix typo in README extension list
2022-04-30 19:11:51 +09:00
Kev
41e371a2d6
Fix typo in README extension list 2022-04-26 22:44:12 +02:00
yuin
113ae87dd9 Fix #258 2022-04-23 22:07:33 +09:00
yuin
fb6057bf26 Fix #256 2022-04-23 21:57:38 +09:00
Yusuke Inuzuka
6bda32624d
Merge pull request #298 from vincentbernat/fix/svg-avif-safe
renderer: image/svg is also safe
2022-04-14 19:57:26 +09:00
yuin
6bdcc0f927 Fix #300 2022-04-14 15:10:12 +09:00
Vincent Bernat
fc877ab357 renderer: image/svg is also safe
When used in an image context, SVG cannot execute scripts, be styled
or fetch additional resources. So, they are as safe as other formats.

When used in a `<a>` tag, it could starts executing JS, but only in
its own context, so it shouldn't be dangerous either.

This raises the question on which image format is dangerous.
2022-04-03 23:34:20 +02:00
Yusuke Inuzuka
e29c1a5dfa
Merge pull request #294 from PaperPrototype/patch-1
Added goldmark-embed to Extensions
2022-03-28 20:40:00 +09:00
Yusuke Inuzuka
96e98dca10
Merge pull request #293 from zhsj/short-test
Skip performance test in short mode
2022-03-28 20:39:48 +09:00
Abdiel Lopez
84ce6bc8ba
Added goldmark-embed to Extensions
Adds support for rendering embeds from YouTube links.
2022-03-26 11:56:57 -04:00
Shengjing Zhu
f28136bf1c Skip performance test in short mode
When packaging this library for Debian, some CI servers are slow
to run performance tests.
2022-03-23 16:07:28 +08:00
yuin
6d6707526e Drop go 1.16 support 2022-03-20 18:31:47 +09:00
Yusuke Inuzuka
31179abc2d
Merge pull request #288 from jchenry/patch-1
Update README.md
2022-03-20 17:57:49 +09:00
Yusuke Inuzuka
2184cbf49b
Merge pull request #292 from jmooring/add-global-attributes
Add global HTML attributes
2022-03-20 17:57:33 +09:00
Yusuke Inuzuka
67340c7d10
Merge branch 'master' into add-global-attributes 2022-03-20 17:57:16 +09:00
Yusuke Inuzuka
5ffbecc0c5
Merge pull request #290 from jmooring/remove-obsolete-global-attributes
Remove obsolete global HTML attributes
2022-03-20 17:55:58 +09:00
Joe Mooring
8da81dfae2 Add global HTML attributes
Closes #291
2022-03-18 12:39:30 -07:00
Joe Mooring
3563ceb58e Remove obsolete global attributes
Closes #289
2022-03-18 12:01:33 -07:00
Colin Henry
f04f7f5340
Update README.md
added link to goldmark-pikchr
2022-03-16 21:16:21 -07:00
yuin
e64a68fc13 Fix #287 2022-03-12 17:18:59 +09:00
yuin
0af271269f Fix #285 2022-03-09 04:01:30 +09:00
yuin
a816d4652e Fix tests for Go1.16 2022-03-05 18:56:34 +09:00
yuin
920c3818d4 Improve raw html parsing performance 2022-03-05 18:45:57 +09:00
yuin
be2bf82af9 Fix performance problems related link labels 2022-03-05 17:37:40 +09:00
yuin
5ba3327fda Fix fuzzer 2022-02-19 18:32:12 +09:00
Yusuke Inuzuka
1def545b06
Merge pull request #280 from natemoo-re/fix/typographer
Improve Typographer Edge Cases
2022-02-19 04:53:02 +09:00
Nate Moore
7aa0ead60c remove log 2022-02-17 10:13:49 -06:00
Nate Moore
5e24a62400 add comment to clarify case 2022-02-17 10:13:18 -06:00
Nate Moore
a6c48071ed fix typographer leading apostrophe for abbreviations 2022-02-17 10:04:21 -06:00
Nate Moore
7b616a4c80 fix typographer single quote edge cases 2022-02-17 09:41:29 -06:00
Nate Moore
c3b7691431 add failing typographer tests 2022-02-16 22:27:33 -06:00
yuin
b7b0919dfe Add AddMeta 2022-02-12 19:01:55 +09:00
yuin
41b1d4542d Update benchmark 2022-02-08 19:24:52 +09:00
yuin
f6e93ffd8f Fix #274, Fix #275 2022-02-08 17:15:15 +09:00
yuin
d44b18596f Update unicode case foldings(Unicode 12.1.0 -> 14.0.0) 2021-11-24 21:33:54 +09:00
yuin
333aa4d779 Fixes #263 2021-11-14 19:49:34 +09:00
yuin
907eb99835 Fixes #262 2021-11-08 16:13:54 +09:00
yuin
0283c9c543 Improve benchmarks with WSL2 2021-10-27 18:30:14 +09:00
yuin
4477be72ee Fix #235 2021-10-24 17:49:45 +09:00
yuin
b2c88c80f6 #248 - 10 2021-10-23 20:17:35 +09:00
yuin
661ccb7c9e Drop Go 1.15 support 2021-10-21 18:27:53 +09:00
yuin
7557842636 #248 - 13 2021-10-21 18:24:30 +09:00
yuin
15ea97611d #248 - 12 2021-10-20 18:07:23 +09:00
yuin
4b793a1aed #248 - 11 2021-10-17 19:48:50 +09:00
yuin
324b2d6e6f #248 - 9 2021-10-17 19:28:38 +09:00
yuin
829dc0ae24 #248 - 8 2021-10-17 19:21:23 +09:00
yuin
e77ca9231a #248 - 8 2021-10-17 17:46:05 +09:00
yuin
05d89a0b45 #248 - 7 2021-10-17 17:29:01 +09:00
yuin
beafde4b8f #248 - 6 2021-10-16 20:12:16 +09:00
yuin
cbaee30aee #248 - 5 2021-10-16 19:55:09 +09:00
yuin
20a276ea45 #248 - 4 2021-10-16 19:39:51 +09:00
yuin
bc90544cef #248 - 3 2021-10-16 18:56:49 +09:00
yuin
f5dcbd8208 #248 - 2 2021-10-16 18:48:59 +09:00
yuin
2f8abf5949 #248 - 1 2021-10-16 17:47:30 +09:00
yuin
829d874034 Fix #237 2021-09-12 18:21:50 +09:00
yuin
d44652d174 Fix #245 - 11 2021-09-11 12:46:34 +09:00
yuin
97df31c2ec Fix #245 - 10 2021-09-11 12:44:01 +09:00
yuin
a8ed3c4205 Fix #245 - 9 2021-09-11 12:31:18 +09:00
yuin
457c157ed5 Fix #245 - 8 2021-09-11 12:25:17 +09:00
yuin
4317d98509 Fix #245 - 7 2021-09-11 11:57:36 +09:00
yuin
1306649a65 Fix #245 - 6 2021-09-11 11:51:43 +09:00
yuin
7efc483c26 Fix #245 - 5 2021-09-11 11:45:00 +09:00
yuin
351308fb72 Fix #245 - 4 2021-09-11 11:20:50 +09:00
yuin
f37563cfa8 Add options for test cases 2021-09-11 11:08:36 +09:00
yuin
466482bf19 Fix #245 - 3 2021-09-11 09:55:07 +09:00
yuin
fad80b4f0c Fix #245 - 2 2021-09-11 09:48:43 +09:00
yuin
8174177880 Fix #245 - 1 2021-09-11 09:44:22 +09:00
Yusuke Inuzuka
aae0486a04
Merge pull request #241 from abhinav/patch-1
extensions: Add hashtag, wikilink, toc, mermaid
2021-08-27 04:22:37 +09:00
Abhinav Gupta
1d522188b9
extensions: Add hashtag, wikilink, toc, mermaid
Add new community extensions to the README.
2021-08-26 05:14:25 -07:00
yuin
5588d92a56 Support CommonMark 0.30 2021-07-05 00:12:46 +09:00
yuin
759cc35c3a Fix #229 2021-06-28 08:05:35 +09:00
yuin
040b478cdb Fixes #223 2021-06-17 19:07:05 +09:00
yuin
38f7fc92ff Fixes #219 2021-05-16 11:03:43 +09:00
Yusuke Inuzuka
43353aeea4
Merge pull request #221 from stephenafamo/soft-line-break
Fix logic for soft line breaks
2021-05-16 22:25:13 +09:00
yuin
16e27ac471 Add DiffPretty 2021-05-15 19:42:58 +09:00
Stephen Afam-Osemene
9568b57c4b fix logic for soft line breaks 2021-05-14 23:30:29 +02:00
Yusuke Inuzuka
155754ef6e
Merge pull request #214 from stephenafamo/patch-1
Add link to goldmark-pdf in README
2021-04-21 14:39:53 +09:00
Stephen Afam-Osemene
ab9c801417
Add link to goldmark-pdf in README 2021-04-20 21:10:27 +02:00
yuin
db22606847 Drop go1.14 support 2021-04-18 18:08:20 +09:00
yuin
9f5125e104 Remove debug print 2021-04-14 02:01:54 +09:00
Yusuke Inuzuka
baec0941d2
Merge pull request #205 from brief/fix-footnote-backlinks
Keep backlinks with footnote text
2021-04-11 00:04:38 +09:00
Yusuke Inuzuka
cd69b03a4e
Merge pull request #207 from jmooring/issue-206
Add "type" to the list attribute filter
2021-04-11 00:04:18 +09:00
Yusuke Inuzuka
24d1a350c2
Merge pull request #210 from eltociear/patch-1
Fix typo in README.md
2021-04-11 00:04:01 +09:00
yuin
ab798ea4aa Fixes #213 2021-04-11 00:00:44 +09:00
Ikko Ashimine
304f513394
Fix typo in README.md
Occurances -> Occurrences
2021-04-02 01:40:26 +09:00
Joe Mooring
3fcf875f9f Add "type" to the list attribute filter
Closes #206
2021-03-31 12:47:01 -07:00
brief
017596e61e Keep backlinks with footnote text
Inserts a non-breaking space before backlinks to prevent them from being separated from footnote text.
2021-03-28 08:39:43 -07:00
Yusuke Inuzuka
75d8cce5b7
Merge pull request #188 from karelbilek/fix-nested-codeblock-empty
Fix empty line detection in markdown in list
2021-03-26 20:41:09 +09:00
yuin
2913ca2902 Fixes a bug related to escaped pipes in a table cell 2021-03-20 17:07:23 +09:00
Karel Bilek
c53c1a4cfe Fix empty line detection in markdown in list
Fix for independent issue in #177

The next parser should have the whitespace removed, even when it's blank
2021-02-08 15:06:31 +07:00
yuin
56bbdf0370 refactoring 2021-02-07 20:36:39 +09:00
yuin
2ffadcefcf Fixes: #177 2021-02-07 19:46:55 +09:00
Yusuke Inuzuka
f3e20f4795
Merge pull request #191 from moorereason/iss190
Fix Linkify to allow host with beginning single-letter domain label
2021-02-07 17:25:11 +09:00
Cameron Moore
3d7ce16f2f
Fix Linkify to allow host with beginning single-letter domain label
Fixes #190
2021-01-30 08:48:27 -06:00
Yusuke Inuzuka
c4b3054802
Merge pull request #183 from hochhaus/master
Use canonical blackfriday v2 import path
2021-01-29 16:52:55 +09:00
Yusuke Inuzuka
4422fe4f7b
Merge pull request #184 from hooligani/patch-1
Update README.md
2021-01-29 16:52:34 +09:00
hooligani
3549ad5b05
Update README.md
Strict is not a function
2021-01-11 21:54:08 +02:00
Andy Hochhaus
e7353e6299 Use canonical blackfriday v2 import path
https://github.com/russross/blackfriday/issues/587
2021-01-10 14:42:39 -06:00
yuin
6c741ae251 Fixes #176 2020-12-26 18:09:14 +09:00
Yusuke Inuzuka
748fc079dc
Merge pull request #172 from moorereason/typos
Fix typos in godocs comments
2020-12-26 18:02:24 +09:00
Yusuke Inuzuka
033db7dd82
Merge pull request #171 from moorereason/unused-buf
Remove unnecessary buffer
2020-12-26 18:01:51 +09:00
Yusuke Inuzuka
4e0c22a20e
Merge pull request #170 from moorereason/deadcode
Remove unused global code
2020-12-26 18:01:11 +09:00
Yusuke Inuzuka
d64a5a61f8
Merge pull request #169 from moorereason/pkg.go.dev
Update package documentation badge in README
2020-12-26 18:00:52 +09:00
yuin
036c8738df Fixes #164, Fixes #167 2020-12-26 17:56:42 +09:00
Cameron Moore
5e417f871d Fix typos in godocs comments 2020-12-17 10:51:19 -06:00
Cameron Moore
fa0cf89934 Remove unnecessary buffer
Remove buf, which appears to be created and appended to but never used
for anything.  Found with staticcheck check SA4010 ("The result of
append will never be observed anywhere").
2020-12-17 10:30:44 -06:00
Cameron Moore
af880df797 Remove unused global code 2020-12-17 10:26:46 -06:00
Cameron Moore
ccd1cd6819 Update package documentation badge in README
godoc.org is deprecated and will begin redirecting to pkg.go.dev in
early 2021.  Update link and badge to point to pkg.go.dev

See https://blog.golang.org/godoc.org-redirect for more information.
2020-12-17 10:19:11 -06:00
yuin
9e0189df27 Closes #161
- Implement footnote configurations defined in original markdown extra.
- Add OwnerDocument() method to ast.Node
- Add Meta() method to *ast.Document
2020-12-13 23:11:07 +09:00
Yusuke Inuzuka
da9729d134
Merge pull request #155 from helfper/linkify-url-with-port
Support URLs with port in Linkify
2020-10-03 03:28:33 +09:00
Helder Pereira
6de424ee3b Support URLs with port in Linkify 2020-09-19 14:48:35 +01:00
yuin
7b90f04af4 Refactoring 2020-08-26 20:22:51 +09:00
yuin
d102aef53c Add goldmark-emoji extension 2020-08-22 23:50:59 +09:00
yuin
e880f8545b Bump go version up 2020-08-20 13:59:16 +09:00
yuin
91e5269fb0 Fixes #148 2020-07-30 22:33:42 +09:00
yuin
bd58441cc1 Fixes #78 2020-07-21 19:32:52 +09:00
Yusuke Inuzuka
b7f25d6cd9
Merge pull request #145 from jsteuer/master
html escape img alt attribute
2020-07-13 17:13:30 +09:00
jsteuer
b91c802b8c html escape img alt attribute 2020-07-11 23:54:26 +02:00
Yusuke Inuzuka
b24f9b4dd7
Merge pull request #139 from yaaax/patch-1
Fix typo
2020-07-10 17:19:40 +09:00
yuin
3c3d4481ef Fixes #141 2020-07-02 15:44:34 +09:00
yuin
feff0bb82b Add case=x,x.. aruguments for tests 2020-07-02 15:28:33 +09:00
Yaacov
e331790a2b
Fix typo 2020-06-23 01:52:39 +02:00
yuin
33089e232f Improve typographer 2020-06-04 15:26:24 +09:00
yuin
4a49786b16 Fixes #136 2020-06-04 15:26:24 +09:00
Yusuke Inuzuka
6eb3819f4b
Merge pull request #134 from jlauinger/fix-unsafe-slice-cast
Fix possible memory confusion in unsafe slice cast
2020-06-04 00:26:44 +09:00
Johannes Lauinger
2141537561
update util_unsafe.go to use simpler, yet still safe, cast 2020-06-03 15:45:44 +02:00
Johannes Lauinger
813d953eeb
fix possible memory confusion in unsafe slice cast 2020-05-31 21:32:17 +02:00
yuin
3d78558cf2 Test case now can have a description 2020-05-22 18:42:44 +09:00
yuin
4709d43b81 Fixes #127 2020-05-22 18:39:02 +09:00
yuin
3393022ab6 Fixes #39, Fixes #127, Fixes #128, Fixes #129 2020-05-21 18:13:20 +09:00
yuin
a302193b06 Closes #124 2020-04-19 15:58:14 +09:00
yuin
184f8ef622 Fixes #123 2020-04-16 11:34:53 +09:00
yuin
1f967da77c remove debug print 2020-04-16 00:29:11 +09:00
yuin
66874b397f Fixes #122 2020-04-15 11:16:46 +09:00
yuin
54b1e988cc Fixes #121 2020-04-13 18:44:57 +09:00
yuin
12fc98ebcd Fixes #120 2020-04-01 02:10:28 +09:00
yuin
84d07d567d Bump up go version 2020-03-25 16:54:13 +09:00
yuin
785b85a76a Fixes #115 2020-03-25 15:39:51 +09:00
Yusuke Inuzuka
ff84cd3a51
Add caution with Go 1.14 2020-03-13 02:07:59 +09:00
yuin
2bfdf48a98 Fix invalid handling of settext in lists 2020-03-10 16:31:55 +09:00
Yusuke Inuzuka
fea52e86ab
Merge pull request #112 from cipherboy/checkbox-space
Insert space after task-list checkboxes
2020-03-08 17:56:39 +09:00
yuin
5ab7c64e28 Fixes #111 2020-03-08 17:55:02 +09:00
Alexander Scheel
5c877c8afe
Insert space after task-list checkboxes
According to the GFM spec (and per other implementations), a space
should be inserted after the checkbox input element. This gives visual
separation between the checkbox and its element. Update code and tests
to match.

Signed-off-by: Alexander Scheel <alexander.m.scheel@gmail.com>
2020-03-07 21:14:45 -05:00
yuin
21437947a3 Fixes #104 : Invalid precending character breaks emphasis 2020-03-01 18:41:50 +09:00
yuin
a727b5adb2 Fixes #103 2020-02-18 21:18:57 +09:00
Yusuke Inuzuka
4a200405d7
Merge pull request #102 from jsteuer/master
Allow inline links with empty link text
2020-02-17 11:34:05 +09:00
jsteuer
3a828e641d move test to its correct place... 2020-02-16 18:07:23 +01:00
jsteuer
2a86c1ea31 allow inline links with empty link text. see #101 2020-02-16 17:56:09 +01:00
Yusuke Inuzuka
70a5404a11
Merge pull request #99 from pzl/typos
Small documentation typos
2020-02-15 14:06:21 +09:00
pzl
8bdab9449a
documentation typo fixes 2020-02-14 22:03:49 -05:00
Yusuke Inuzuka
0b49730177
Merge pull request #98 from jschaf/goldmark-97
Fix ast.Walk to respect WalkStop
2020-02-09 18:51:13 +09:00
Joe Schafer
3c340e9970 Fix ast.Walk to respect WalkStop
Fixes #97
2020-02-07 01:42:40 -08:00
yuin
39db45a099 Fixes #94 2020-01-27 14:56:49 +09:00
Yusuke Inuzuka
faaafa55b6
Merge pull request #88 from tangiel/apostrophe
typographer: add basic support for apostrophes
2020-01-11 17:52:47 +09:00
Daniel Tang
45fa0e2645 typographer: add basic support for apostrophes 2020-01-09 15:35:41 -08:00
Yusuke Inuzuka
5294fa5c4f
Merge pull request #87 from qiuzhiqian/master
fix bug: BaseNode.RemoveChildren only set the first child as nil
2020-01-10 00:13:16 +09:00
xml
4fc27178e4 fix bug: if do c.SetNextSibling(nil) and c = c.NextSibling(). After remove the first child the var c is always nil.the next child will not be set nil 2020-01-09 17:33:21 +08:00
Yusuke Inuzuka
3e38e966f6
Update ISSUE_TEMPLATE.md 2020-01-06 19:18:03 +09:00
Yusuke Inuzuka
97405beafd
Merge pull request #85 from hivehand/text-tweaks
Tweak README.md punctuation, grammar, and phrasing
2020-01-04 19:08:27 +09:00
Dan Martinez
4fedb553e3 Tweak README.md punctuation, grammar, and phrasing 2020-01-03 17:50:00 -08:00
Yusuke Inuzuka
4f96330d3f
Merge pull request #84 from Kissaki/patch-1
Link to CommonMark website in README.md
2020-01-02 01:55:28 +09:00
yuin
533a448c5f Update README.md 2020-01-02 01:54:01 +09:00
yuin
94947895e7 Update README.md 2020-01-02 01:51:36 +09:00
yuin
83e7dcebc7 Update README.md 2020-01-02 01:50:19 +09:00
yuin
4b54582dee Fixes #83, Adds options for Linkify 2020-01-02 01:36:46 +09:00
Jan Klass
471456b80a Link to CommonMark website in README.md
Link to the CommonMark website to explicitly reference what it is, and to help users unfamiliar with it to discover and navigate to it.

CommonMark is mentioned very early in the README. Here it is a subheadline. I decided to not link here but in this features section instead where it is listed as one of the features. The subheadline seems more fitting as a short, pure text for those knowing about it.

The link is to the main website rather than the specification as users unfamiliar with it will want to land on the landing page with a description of what it is rather than a technical specification. The specification page does not immediately and obviously link back to the main website.

The shortened term spec is being extended to its full word specification for clarity and discoverability.
2019-12-31 10:14:18 +01:00
yuin
f9b134f6be Fixes #81 2019-12-27 16:03:11 +09:00
Yusuke Inuzuka
2c9db0c8fa
Merge pull request #78 from zzwx-forks/master
Additional <thead>, <tr>, <th>, <td> attributes render
2019-12-26 12:02:50 +09:00
yuin
66a48f66b8 Fixes linkify regression.
https://github.github.com/gfm/#example-627 says '<' immediately ends an autolink.
But goldmark had continued to autolink with '<' due to recently changes.
2019-12-25 15:07:45 +09:00
zzwx
224bf7d721 Additional attributes render with comments 2019-12-24 19:37:21 -05:00
Yusuke Inuzuka
9a4c7bce92
Update ISSUE_TEMPLATE.md 2019-12-22 23:50:35 +09:00
yuin
923eb97048 Fixes #71 2019-12-22 23:31:22 +09:00
yuin
e4108d55a4 Update issue template 2019-12-22 22:48:41 +09:00
yuin
12851a08ba Fixes #70, #74 2019-12-22 22:27:39 +09:00
Yusuke Inuzuka
efd5960110
Merge pull request #75 from mdigger/master
Put header id to generator
2019-12-22 21:13:15 +09:00
Yusuke Inuzuka
a6724f003f
Merge pull request #72 from zzwx/master
Render table attributes
2019-12-22 21:13:01 +09:00
Dmitry Sedykh
e0a5ff07c2 put ID to generator 2019-12-22 09:46:40 +03:00
zzwx
5690da2615 Render table attributes 2019-12-20 19:52:09 -05:00
yuin
2aab93edb4 Fixes tests 2019-12-20 11:12:24 +09:00
yuin
e0171097bf Fixes #69 2019-12-20 11:11:28 +09:00
yuin
26fb0b56e6 Fixes gohugoio/hugo#6626 2019-12-18 15:46:54 +09:00
yuin
64d4e16bf4 Fixes #65 2019-12-18 11:37:07 +09:00
yuin
6d2e5fddae Fix #64 and some lint warnings 2019-12-16 11:08:45 +09:00
Yusuke Inuzuka
5b25bb53dd
Merge pull request #63 from aletson/master
incorporate remaining changes from yuin/goldmark#52
2019-12-11 11:47:02 +09:00
Allison Letson
26b4b47473
incorporate remaining changes from yuin/goldmark#52 2019-12-10 09:25:03 -05:00
Yusuke Inuzuka
9e55545c8d
Update ISSUE_TEMPLATE.md 2019-12-10 01:33:25 +09:00
yuin
ec246695c5 Fix fail on go1.12 2019-12-08 19:00:20 +09:00
yuin
7d8bee11ca Closes #33 : Now NodeRenderers render attributes 2019-12-08 18:53:01 +09:00
Yusuke Inuzuka
0a62f6ae63
Create stale.yml 2019-12-08 00:45:12 +09:00
Yusuke Inuzuka
a47a029d55
Fix typo 2019-12-06 18:23:52 +09:00
Yusuke Inuzuka
e528cffacf
Update benchmark results 2019-12-06 13:10:48 +09:00
Yusuke Inuzuka
89919744e0
Fix badge URL 2019-12-06 02:05:12 +09:00
yuin
171dbc66a8 Fixes #50 2019-12-05 13:39:55 +09:00
yuin
ea2c6c34ce Fixes #55 2019-12-05 11:27:29 +09:00
Yusuke Inuzuka
66aa5562f7
Merge pull request #54 from moorereason/fuzz-more
Fuzz more code paths
2019-12-05 11:17:50 +09:00
Yusuke Inuzuka
43e0347f6d
Merge pull request #52 from anthonyfok/fix-typos-in-readme
Fix typos in README.md
2019-12-05 10:44:30 +09:00
Cameron Moore
25f82f0a2d Fuzz more code paths 2019-12-04 14:12:40 -06:00
Anthony Fok
30605bf736
Fix typos in README.md 2019-12-04 05:15:50 -07:00
Yusuke Inuzuka
567046a85d
Merge pull request #51 from adiabatic/master
Improve display of footnote backlinks
2019-12-04 18:07:01 +09:00
Nathan Galt
6f9629fb2b Improve display of footnote backlinks
This makes three changes to backlinks:

- Inserts a space before `<a href='…'>↩</a>` to make it look better
- Uses hexadecimal references (`&#x…;`) instead of decimal references for clarity
- Adds U+FE0E VARIATION SELECTOR 15 to the ↩ to suppress emojification on both iOS and Edge on Windows

If variation selector 15 isn't appended to the ↩, then the return arrow will show up as an emoji on iOS and Edge for Windows. The Pandoc project has already run into this quirk as documented on https://github.com/jgm/pandoc/issues/5469 and they've pushed a change for their Markdown processor already.
2019-12-03 19:34:16 -08:00
yuin
511e434efc Fixes #49 2019-12-03 13:31:57 +09:00
yuin
8a50115f03 Remove unsed variables 2019-12-02 18:26:06 +09:00
yuin
5334c63923 Change IDs argumnent 2019-12-02 18:24:22 +09:00
yuin
ac8e225cd3 Fix invalid handling of consecutive code spans 2019-12-02 17:13:37 +09:00
yuin
68dcec6ac4 Closes #46 : Add WithIDs option 2019-12-02 14:38:28 +09:00
yuin
6efe809cde Update issue template 2019-12-02 13:23:33 +09:00
yuin
eb2667632a Fixes #48 2019-12-02 13:17:08 +09:00
Yusuke Inuzuka
615d5706c6
Merge pull request #43 from moorereason/clean-lint
Clean lint
2019-12-02 03:17:56 +09:00
yuin
2f292e5b74 Fixes #44, Fixes #45 2019-12-02 03:10:06 +09:00
Cameron Moore
74e1374f5a Use io.ByteWriter 2019-11-29 13:42:55 -06:00
Cameron Moore
ff066ede82 Fix gofmt issues 2019-11-29 13:39:42 -06:00
Cameron Moore
3dc5ebdb17 Fix golint issues 2019-11-29 13:31:28 -06:00
Cameron Moore
2932dadfb3 Fix gofmt formatting 2019-11-29 13:22:11 -06:00
Cameron Moore
748be0c096 Fix shadow declarations 2019-11-29 13:20:34 -06:00
yuin
9f9f8f0e5e Closes #41 2019-11-30 03:33:46 +09:00
yuin
8549b83b0c Fixes gohugoio/hugo#6549 2019-11-29 17:21:05 +09:00
yuin
54fc7c3f18 Closes #40 2019-11-29 17:03:00 +09:00
yuin
b611cd333a Add issue template 2019-11-28 11:39:10 +09:00
yuin
6c55ba55a1 Fixes #36 2019-11-27 03:00:17 +09:00
yuin
4536e57938 Fixes #35 2019-11-25 20:35:02 +09:00
yuin
fba5de7344 Fixes #34 2019-11-24 20:17:02 +09:00
yuin
9dec7e9e8b Fixes #31 2019-11-20 01:10:18 +09:00
yuin
c8c3b41fe0 Update badge 2019-11-18 16:15:15 +09:00
yuin
c999f5a9a7 Fix workflow
gcov2lcov seems failing to handle coverpkg?
2019-11-18 16:06:33 +09:00
yuin
22bbda3653 Fix workflow 2019-11-18 15:42:59 +09:00
yuin
76006af024 Remove .travis.yml 2019-11-18 15:28:31 +09:00
yuin
d51543d817 Fix workflow 2019-11-18 15:25:17 +09:00
yuin
66a1061d96 Fix workflow 2019-11-18 15:23:32 +09:00
yuin
8aefee4a22 migrate travisCI to Github actions 2019-11-18 15:20:39 +09:00
yuin
696c860a32 Update go version 2019-11-16 21:31:10 +09:00
yuin
afc3654ecf Fixes #28, #29 2019-11-16 21:11:57 +09:00
yuin
ea8789f650 Fixed #27 2019-11-11 03:59:36 +09:00
yuin
16b69522a4 Remove the WithWorkers option
Situations that concurrent inline parsing is effective are very limited
due to goroutine overheads and a parse context sharing mutex.
2019-10-31 17:46:02 +09:00
Yusuke Inuzuka
2184586bb2
Update README.md 2019-08-30 22:21:16 +09:00
yuin
13a98719d4 Fix invalid use of lute 2019-08-30 21:54:02 +09:00
89 changed files with 14807 additions and 6883 deletions

17
.github/ISSUE_TEMPLATE.md vendored Normal file
View file

@ -0,0 +1,17 @@
goldmark has [https://github.com/yuin/goldmark/discussions](Discussions) in github.
You should post only issues here. Feature requests and questions should be posted at discussions.
- [ ] goldmark is fully compliant with the CommonMark. Before submitting issue, you **must** read [CommonMark spec](https://spec.commonmark.org/0.29/) and confirm your output is different from [CommonMark online demo](https://spec.commonmark.org/dingus/).
- [ ] **Extensions(Autolink without `<` `>`, Table, etc) are not part of CommonMark spec.** You should confirm your output is different from other official renderers correspond with an extension.
- [ ] **goldmark is not dedicated for Hugo**. If you are Hugo user and your issue was raised by your experience in Hugo, **you should consider create issue at Hugo repository at first** .
Please answer the following before submitting your issue:
1. What version of goldmark are you using? :
2. What version of Go are you using? :
3. What operating system and processor architecture are you using? :
4. What did you do? :
5. What did you expect to see? :
6. What did you see instead? :
7. Did you confirm your output is different from [CommonMark online demo](https://spec.commonmark.org/dingus/) or other official renderer correspond with an extension?:

26
.github/workflows/stale.yaml vendored Normal file
View file

@ -0,0 +1,26 @@
name: Close inactive issues
on:
schedule:
- cron: "30 9 * * *"
jobs:
close-issues:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v5
with:
days-before-issue-stale: 30
days-before-issue-close: 14
stale-issue-label: "stale"
stale-issue-message: "This issue is stale because it has been open for 30 days with no activity."
close-issue-message: "This issue was closed because it has been inactive for 14 days since being marked as stale."
exempt-issue-labels: "pinned,security"
days-before-pr-stale: 180
days-before-pr-close: 14
stale-pr-label: "stale"
stale-pr-message: "This PR is stale because it has been open for 180 days with no activity."
close-pr-message: "This PR was closed because it has been inactive for 14 days since being marked as stale."
exempt-pr-labels: "pinned,security"
repo-token: ${{ secrets.GITHUB_TOKEN }}

33
.github/workflows/test.yaml vendored Normal file
View file

@ -0,0 +1,33 @@
on: [push, pull_request]
name: test
jobs:
test:
strategy:
fail-fast: false
matrix:
go-version: [1.21.x, 1.22.x]
platform: [ubuntu-latest, macos-latest, windows-latest]
runs-on: ${{ matrix.platform }}
steps:
- name: Install Go
uses: actions/setup-go@v4
with:
go-version: ${{ matrix.go-version }}
- name: Checkout code
uses: actions/checkout@v3
- name: Run lints
uses: golangci/golangci-lint-action@v6
with:
version: latest
if: "matrix.platform == 'ubuntu-latest'" # gofmt linter fails on Windows for CRLF problems
- name: Run tests
env:
GOLDMARK_TEST_TIMEOUT_MULTIPLIER: 5
run: go test -v ./... -covermode=count -coverprofile=coverage.out -coverpkg=./...
- name: Install goveralls
run: go install github.com/mattn/goveralls@latest
- name: Send coverage
if: "matrix.platform == 'ubuntu-latest'"
env:
COVERALLS_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: goveralls -coverprofile=coverage.out -service=github

105
.golangci.yml Normal file
View file

@ -0,0 +1,105 @@
run:
deadline: 10m
issues:
exclude-use-default: false
exclude-rules:
- path: _test.go
linters:
- errcheck
- lll
exclude:
- "Package util"
linters:
disable-all: true
enable:
- errcheck
- gosimple
- govet
- ineffassign
- staticcheck
- typecheck
- unused
- gofmt
- godot
- makezero
- misspell
- revive
- wastedassign
- lll
linters-settings:
revive:
severity: "warning"
confidence: 0.8
rules:
- name: blank-imports
severity: warning
disabled: false
- name: context-as-argument
severity: warning
disabled: false
- name: context-keys-type
severity: warning
disabled: false
- name: dot-imports
severity: warning
disabled: true
- name: error-return
severity: warning
disabled: false
- name: error-strings
severity: warning
disabled: false
- name: error-naming
severity: warning
disabled: false
- name: exported
severity: warning
disabled: false
- name: increment-decrement
severity: warning
disabled: false
- name: var-naming
severity: warning
disabled: false
- name: var-declaration
severity: warning
disabled: false
- name: package-comments
severity: warning
disabled: false
- name: range
severity: warning
disabled: false
- name: receiver-naming
severity: warning
disabled: false
- name: time-naming
severity: warning
disabled: false
- name: unexported-return
severity: warning
disabled: false
- name: indent-error-flow
severity: warning
disabled: false
- name: errorf
severity: warning
disabled: false
- name: empty-block
severity: warning
disabled: true
- name: superfluous-else
severity: warning
disabled: false
- name: unused-parameter
severity: warning
disabled: true
- name: unreachable-code
severity: warning
disabled: false
- name: redefines-builtin-id
severity: warning
disabled: false

View file

@ -1,17 +0,0 @@
language: go
go:
- "1.11.x"
- "1.12.x"
env:
global:
GO111MODULE=off
before_install:
- go get github.com/axw/gocov/gocov
- go get github.com/mattn/goveralls
- if ! go get code.google.com/p/go.tools/cmd/cover; then go get golang.org/x/tools/cmd/cover; fi
install:
- go get -u -v $(go list -f '{{join .Imports "\n"}}{{"\n"}}{{join .TestImports "\n"}}' ./... | sort | uniq | grep '\.' | grep -v goldmark)
script:
- $HOME/gopath/bin/goveralls -service=travis-ci

View file

@ -1,4 +1,7 @@
.PHONY: test fuzz
.PHONY: test fuzz lint
lint:
golangci-lint run -c .golangci.yml ./...
test:
go test -coverprofile=profile.out -coverpkg=github.com/yuin/goldmark,github.com/yuin/goldmark/ast,github.com/yuin/goldmark/extension,github.com/yuin/goldmark/extension/ast,github.com/yuin/goldmark/parser,github.com/yuin/goldmark/renderer,github.com/yuin/goldmark/renderer/html,github.com/yuin/goldmark/text,github.com/yuin/goldmark/util ./...
@ -7,10 +10,4 @@ cov: test
go tool cover -html=profile.out
fuzz:
which go-fuzz > /dev/null 2>&1 || (GO111MODULE=off go get -u github.com/dvyukov/go-fuzz/go-fuzz github.com/dvyukov/go-fuzz/go-fuzz-build; GO111MODULE=off go get -d github.com/dvyukov/go-fuzz-corpus; true)
rm -rf ./fuzz/corpus
rm -rf ./fuzz/crashers
rm -rf ./fuzz/suppressions
rm -f ./fuzz/fuzz-fuzz.zip
cd ./fuzz && go-fuzz-build
cd ./fuzz && go-fuzz
cd ./fuzz && go test -fuzz=Fuzz

520
README.md
View file

@ -1,53 +1,55 @@
goldmark
==========================================
[![http://godoc.org/github.com/yuin/goldmark](https://godoc.org/github.com/yuin/goldmark?status.svg)](http://godoc.org/github.com/yuin/goldmark)
[![https://travis-ci.org/yuin/goldmark](https://travis-ci.org/yuin/goldmark.svg)](https://travis-ci.org/yuin/goldmark)
[![https://coveralls.io/r/yuin/goldmark](https://coveralls.io/repos/yuin/goldmark/badge.svg)](https://coveralls.io/r/yuin/goldmark)
[![https://pkg.go.dev/github.com/yuin/goldmark](https://pkg.go.dev/badge/github.com/yuin/goldmark.svg)](https://pkg.go.dev/github.com/yuin/goldmark)
[![https://github.com/yuin/goldmark/actions?query=workflow:test](https://github.com/yuin/goldmark/actions/workflows/test.yaml/badge.svg?branch=master&event=push)](https://github.com/yuin/goldmark/actions?query=workflow:test)
[![https://coveralls.io/github/yuin/goldmark](https://coveralls.io/repos/github/yuin/goldmark/badge.svg?branch=master)](https://coveralls.io/github/yuin/goldmark)
[![https://goreportcard.com/report/github.com/yuin/goldmark](https://goreportcard.com/badge/github.com/yuin/goldmark)](https://goreportcard.com/report/github.com/yuin/goldmark)
> A markdown parser written in Go. Easy to extend, standard compliant, well structured.
> A Markdown parser written in Go. Easy to extend, standards-compliant, well-structured.
goldmark is compliant to CommonMark 0.29.
goldmark is compliant with CommonMark 0.31.2.
- [goldmark playground](https://yuin.github.io/goldmark/playground/) : Try goldmark online. This playground is built with WASM(5-10MB).
Motivation
----------------------
I need a markdown parser for Go that meets following conditions:
I needed a Markdown parser for Go that satisfies the following requirements:
- Easy to extend.
- Markdown is poor in document expressions compared with other light markup languages like restructuredText.
- We have extended a markdown syntax. i.e. : PHPMarkdownExtra, Github Flavored Markdown.
- Standard compliant.
- Markdown is poor in document expressions compared to other light markup languages such as reStructuredText.
- We have extensions to the Markdown syntax, e.g. PHP Markdown Extra, GitHub Flavored Markdown.
- Standards-compliant.
- Markdown has many dialects.
- Github Flavored Markdown is widely used and it is based on CommonMark aside from whether CommonMark is good specification or not.
- CommonMark is too complicated and hard to implement.
- Well structured.
- AST based, and preserves source position of nodes.
- GitHub-Flavored Markdown is widely used and is based upon CommonMark, effectively mooting the question of whether or not CommonMark is an ideal specification.
- CommonMark is complicated and hard to implement.
- Well-structured.
- AST-based; preserves source position of nodes.
- Written in pure Go.
[golang-commonmark](https://gitlab.com/golang-commonmark/markdown) may be a good choice, but it seems copy of the [markdown-it](https://github.com/markdown-it) .
[golang-commonmark](https://gitlab.com/golang-commonmark/markdown) may be a good choice, but it seems to be a copy of [markdown-it](https://github.com/markdown-it).
[blackfriday.v2](https://github.com/russross/blackfriday/tree/v2) is a fast and widely used implementation, but it is not CommonMark compliant and can not be extended from outside of the package since it's AST is not interfaces but structs.
[blackfriday.v2](https://github.com/russross/blackfriday/tree/v2) is a fast and widely-used implementation, but is not CommonMark-compliant and cannot be extended from outside of the package, since its AST uses structs instead of interfaces.
Furthermore, its behavior differs with other implementations in some cases especially of lists. ([Deep nested lists don't output correctly #329](https://github.com/russross/blackfriday/issues/329), [List block cannot have a second line #244](https://github.com/russross/blackfriday/issues/244), etc).
Furthermore, its behavior differs from other implementations in some cases, especially regarding lists: [Deep nested lists don't output correctly #329](https://github.com/russross/blackfriday/issues/329), [List block cannot have a second line #244](https://github.com/russross/blackfriday/issues/244), etc.
This behavior sometimes causes problems. If you migrate your markdown text to blackfriday based wikis from Github, many lists will immediately be broken.
This behavior sometimes causes problems. If you migrate your Markdown text from GitHub to blackfriday-based wikis, many lists will immediately be broken.
As mentioned above, CommonMark is too complicated and hard to implement, So Markdown parsers based on CommonMark barely exist.
As mentioned above, CommonMark is complicated and hard to implement, so Markdown parsers based on CommonMark are few and far between.
Features
----------------------
- **Standard compliant.** : goldmark get full compliance with latest CommonMark spec.
- **Extensible.** : Do you want to add a `@username` mention syntax to the markdown?
You can easily do it in goldmark. You can add your AST nodes,
parsers for block level elements, parsers for inline level elements,
transformers for paragraphs, transformers for whole AST structure, and
- **Standards-compliant.** goldmark is fully compliant with the latest [CommonMark](https://commonmark.org/) specification.
- **Extensible.** Do you want to add a `@username` mention syntax to Markdown?
You can easily do so in goldmark. You can add your AST nodes,
parsers for block-level elements, parsers for inline-level elements,
transformers for paragraphs, transformers for the whole AST structure, and
renderers.
- **Preformance.** : goldmark performs pretty much equally to the cmark
(CommonMark reference implementation written in c).
- **Robust** : goldmark is tested with [go-fuzz](https://github.com/dvyukov/go-fuzz), a fuzz testing tool.
- **Builtin extensions.** : goldmark ships with common extensions like tables, strikethrough,
- **Performance.** goldmark's performance is on par with that of cmark,
the CommonMark reference implementation written in C.
- **Robust.** goldmark is tested with `go test --fuzz`.
- **Built-in extensions.** goldmark ships with common extensions like tables, strikethrough,
task lists, and definition lists.
- **Depends only on standard libraries.**
@ -62,15 +64,15 @@ Usage
----------------------
Import packages:
```
```go
import (
"bytes"
"github.com/yuin/goldmark"
"bytes"
"github.com/yuin/goldmark"
)
```
Convert Markdown documents with the CommonMark compliant mode:
Convert Markdown documents with the CommonMark-compliant mode:
```go
var buf bytes.Buffer
@ -84,29 +86,32 @@ With options
```go
var buf bytes.Buffer
if err := goldmark.Convert(source, &buf, parser.WithWorkers(16)); err != nil {
if err := goldmark.Convert(source, &buf, parser.WithContext(ctx)); err != nil {
panic(err)
}
```
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `parser.WithContext` | A parser.Context | Context for the parsing phase. |
| parser.WithWorkers | int | Number of goroutines that execute concurrent inline element parsing. |
| `parser.WithContext` | A `parser.Context` | Context for the parsing phase. |
Context options
----------------------
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `parser.WithIDs` | A `parser.IDs` | `IDs` allows you to change logics that are related to element id(ex: Auto heading id generation). |
`parser.WithWorkers` may make performance better a little if markdown text
is relatively large. Otherwise, `parser.Workers` may cause performance degradation due to
goroutine overheads.
Custom parser and renderer
--------------------------
```go
import (
"bytes"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
"bytes"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
)
md := goldmark.New(
@ -125,6 +130,14 @@ if err := md.Convert(source, &buf); err != nil {
}
```
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `goldmark.WithParser` | `parser.Parser` | This option must be passed before `goldmark.WithParserOptions` and `goldmark.WithExtensions` |
| `goldmark.WithRenderer` | `renderer.Renderer` | This option must be passed before `goldmark.WithRendererOptions` and `goldmark.WithExtensions` |
| `goldmark.WithParserOptions` | `...parser.Option` | |
| `goldmark.WithRendererOptions` | `...renderer.Option` | |
| `goldmark.WithExtensions` | `...goldmark.Extender` | |
Parser and Renderer options
------------------------------
@ -132,9 +145,10 @@ Parser and Renderer options
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `parser.WithBlockParsers` | A `util.PrioritizedSlice` whose elements are `parser.BlockParser` | Parsers for parsing block level elements. |
| `parser.WithInlineParsers` | A `util.PrioritizedSlice` whose elements are `parser.InlineParser` | Parsers for parsing inline level elements. |
| `parser.WithParagraphTransformers` | A `util.PrioritizedSlice` whose elements are `parser.ParagraphTransformer` | Transformers for transforming paragraph nodes. |
| `parser.WithBlockParsers` | A `util.PrioritizedSlice` whose elements are `parser.BlockParser` | Parsers for parsing block level elements. |
| `parser.WithInlineParsers` | A `util.PrioritizedSlice` whose elements are `parser.InlineParser` | Parsers for parsing inline level elements. |
| `parser.WithParagraphTransformers` | A `util.PrioritizedSlice` whose elements are `parser.ParagraphTransformer` | Transformers for transforming paragraph nodes. |
| `parser.WithASTTransformers` | A `util.PrioritizedSlice` whose elements are `parser.ASTTransformer` | Transformers for transforming an AST. |
| `parser.WithAutoHeadingID` | `-` | Enables auto heading ids. |
| `parser.WithAttribute` | `-` | Enables custom attributes. Currently only headings supports attributes. |
@ -143,39 +157,42 @@ Parser and Renderer options
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `html.WithWriter` | `html.Writer` | `html.Writer` for writing contents to an `io.Writer`. |
| `html.WithHardWraps` | `-` | Render new lines as `<br>`.|
| `html.WithHardWraps` | `-` | Render newlines as `<br>`.|
| `html.WithXHTML` | `-` | Render as XHTML. |
| `html.WithUnsafe` | `-` | By default, goldmark does not render raw HTMLs and potentially dangerous links. With this option, goldmark renders these contents as it is. |
| `html.WithUnsafe` | `-` | By default, goldmark does not render raw HTML or potentially dangerous links. With this option, goldmark renders such content as written. |
### Built-in extensions
- `extension.Table`
- [Github Flavored Markdown: Tables](https://github.github.com/gfm/#tables-extension-)
- [GitHub Flavored Markdown: Tables](https://github.github.com/gfm/#tables-extension-)
- `extension.Strikethrough`
- [Github Flavored Markdown: Strikethrough](https://github.github.com/gfm/#strikethrough-extension-)
- [GitHub Flavored Markdown: Strikethrough](https://github.github.com/gfm/#strikethrough-extension-)
- `extension.Linkify`
- [Github Flavored Markdown: Autolinks](https://github.github.com/gfm/#autolinks-extension-)
- [GitHub Flavored Markdown: Autolinks](https://github.github.com/gfm/#autolinks-extension-)
- `extension.TaskList`
- [Github Flavored Markdown: Task list items](https://github.github.com/gfm/#task-list-items-extension-)
- [GitHub Flavored Markdown: Task list items](https://github.github.com/gfm/#task-list-items-extension-)
- `extension.GFM`
- This extension enables Table, Strikethrough, Linkify and TaskList.
- This extension does not filter tags defined in [6.11Disallowed Raw HTML (extension)](https://github.github.com/gfm/#disallowed-raw-html-extension-).
If you need to filter HTML tags, see [Security](#security)
- This extension enables Table, Strikethrough, Linkify and TaskList.
- This extension does not filter tags defined in [6.11: Disallowed Raw HTML (extension)](https://github.github.com/gfm/#disallowed-raw-html-extension-).
If you need to filter HTML tags, see [Security](#security).
- If you need to parse github emojis, you can use [goldmark-emoji](https://github.com/yuin/goldmark-emoji) extension.
- `extension.DefinitionList`
- [PHP Markdown Extra: Definition lists](https://michelf.ca/projects/php-markdown/extra/#def-list)
- [PHP Markdown Extra: Definition lists](https://michelf.ca/projects/php-markdown/extra/#def-list)
- `extension.Footnote`
- [PHP Markdown Extra: Footnotes](https://michelf.ca/projects/php-markdown/extra/#footnotes)
- [PHP Markdown Extra: Footnotes](https://michelf.ca/projects/php-markdown/extra/#footnotes)
- `extension.Typographer`
- This extension substitutes punctuations with typographic entities like [smartypants](https://daringfireball.net/projects/smartypants/).
- This extension substitutes punctuations with typographic entities like [smartypants](https://daringfireball.net/projects/smartypants/).
- `extension.CJK`
- This extension is a shortcut for CJK related functionalities.
### Attributes
`parser.WithAttribute` option allows you to define attributes on some elements.
The `parser.WithAttribute` option allows you to define attributes on some elements.
Currently only headings support attributes.
**Attributes are being discussed in the
[CommonMark forum](https://talk.commonmark.org/t/consistent-attribute-syntax/272).
This syntax possibly changes in the future.**
**Attributes are being discussed in the
[CommonMark forum](https://talk.commonmark.org/t/consistent-attribute-syntax/272).
This syntax may possibly change in the future.**
#### Headings
@ -191,13 +208,25 @@ heading {#id .className attrName=attrValue}
============
```
### Table extension
The Table extension implements [Table(extension)](https://github.github.com/gfm/#tables-extension-), as
defined in [GitHub Flavored Markdown Spec](https://github.github.com/gfm/).
Specs are defined for XHTML, so specs use some deprecated attributes for HTML5.
You can override alignment rendering method via options.
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `extension.WithTableCellAlignMethod` | `extension.TableCellAlignMethod` | Option indicates how are table cells aligned. |
### Typographer extension
Typographer extension translates plain ASCII punctuation characters into typographic punctuation HTML entities.
The Typographer extension translates plain ASCII punctuation characters into typographic-punctuation HTML entities.
Default substitutions are:
| Punctuation | Default entitiy |
| Punctuation | Default entity |
| ------------ | ---------- |
| `'` | `&lsquo;`, `&rsquo;` |
| `"` | `&ldquo;`, `&rdquo;` |
@ -207,25 +236,313 @@ Default substitutions are:
| `<<` | `&laquo;` |
| `>>` | `&raquo;` |
You can overwrite the substitutions by `extensions.WithTypographicSubstitutions`.
You can override the default substitutions via `extensions.WithTypographicSubstitutions`:
```go
markdown := goldmark.New(
goldmark.WithExtensions(
extension.NewTypographer(
extension.WithTypographicSubstitutions(extension.TypographicSubstitutions{
extension.LeftSingleQuote: []byte("&sbquo;"),
extension.RightSingleQuote: nil, // nil disables a substitution
}),
),
),
goldmark.WithExtensions(
extension.NewTypographer(
extension.WithTypographicSubstitutions(extension.TypographicSubstitutions{
extension.LeftSingleQuote: []byte("&sbquo;"),
extension.RightSingleQuote: nil, // nil disables a substitution
}),
),
),
)
```
### Linkify extension
The Linkify extension implements [Autolinks(extension)](https://github.github.com/gfm/#autolinks-extension-), as
defined in [GitHub Flavored Markdown Spec](https://github.github.com/gfm/).
Since the spec does not define details about URLs, there are numerous ambiguous cases.
You can override autolinking patterns via options.
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `extension.WithLinkifyAllowedProtocols` | `[][]byte \| []string` | List of allowed protocols such as `[]string{ "http:" }` |
| `extension.WithLinkifyURLRegexp` | `*regexp.Regexp` | Regexp that defines URLs, including protocols |
| `extension.WithLinkifyWWWRegexp` | `*regexp.Regexp` | Regexp that defines URL starting with `www.`. This pattern corresponds to [the extended www autolink](https://github.github.com/gfm/#extended-www-autolink) |
| `extension.WithLinkifyEmailRegexp` | `*regexp.Regexp` | Regexp that defines email addresses` |
Example, using [xurls](https://github.com/mvdan/xurls):
```go
import "mvdan.cc/xurls/v2"
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
extension.NewLinkify(
extension.WithLinkifyAllowedProtocols([]string{
"http:",
"https:",
}),
extension.WithLinkifyURLRegexp(
xurls.Strict(),
),
),
),
)
```
### Footnotes extension
The Footnote extension implements [PHP Markdown Extra: Footnotes](https://michelf.ca/projects/php-markdown/extra/#footnotes).
This extension has some options:
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `extension.WithFootnoteIDPrefix` | `[]byte \| string` | a prefix for the id attributes.|
| `extension.WithFootnoteIDPrefixFunction` | `func(gast.Node) []byte` | a function that determines the id attribute for given Node.|
| `extension.WithFootnoteLinkTitle` | `[]byte \| string` | an optional title attribute for footnote links.|
| `extension.WithFootnoteBacklinkTitle` | `[]byte \| string` | an optional title attribute for footnote backlinks. |
| `extension.WithFootnoteLinkClass` | `[]byte \| string` | a class for footnote links. This defaults to `footnote-ref`. |
| `extension.WithFootnoteBacklinkClass` | `[]byte \| string` | a class for footnote backlinks. This defaults to `footnote-backref`. |
| `extension.WithFootnoteBacklinkHTML` | `[]byte \| string` | a class for footnote backlinks. This defaults to `&#x21a9;&#xfe0e;`. |
Some options can have special substitutions. Occurrences of “^^” in the string will be replaced by the corresponding footnote number in the HTML output. Occurrences of “%%” will be replaced by a number for the reference (footnotes can have multiple references).
`extension.WithFootnoteIDPrefix` and `extension.WithFootnoteIDPrefixFunction` are useful if you have multiple Markdown documents displayed inside one HTML document to avoid footnote ids to clash each other.
`extension.WithFootnoteIDPrefix` sets fixed id prefix, so you may write codes like the following:
```go
for _, path := range files {
source := readAll(path)
prefix := getPrefix(path)
markdown := goldmark.New(
goldmark.WithExtensions(
NewFootnote(
WithFootnoteIDPrefix(path),
),
),
)
var b bytes.Buffer
err := markdown.Convert(source, &b)
if err != nil {
t.Error(err.Error())
}
}
```
`extension.WithFootnoteIDPrefixFunction` determines an id prefix by calling given function, so you may write codes like the following:
```go
markdown := goldmark.New(
goldmark.WithExtensions(
NewFootnote(
WithFootnoteIDPrefixFunction(func(n gast.Node) []byte {
v, ok := n.OwnerDocument().Meta()["footnote-prefix"]
if ok {
return util.StringToReadOnlyBytes(v.(string))
}
return nil
}),
),
),
)
for _, path := range files {
source := readAll(path)
var b bytes.Buffer
doc := markdown.Parser().Parse(text.NewReader(source))
doc.Meta()["footnote-prefix"] = getPrefix(path)
err := markdown.Renderer().Render(&b, source, doc)
}
```
You can use [goldmark-meta](https://github.com/yuin/goldmark-meta) to define a id prefix in the markdown document:
Create extensions
```markdown
---
title: document title
slug: article1
footnote-prefix: article1
---
# My article
```
### CJK extension
CommonMark gives compatibilities a high priority and original markdown was designed by westerners. So CommonMark lacks considerations for languages like CJK.
This extension provides additional options for CJK users.
| Functional option | Type | Description |
| ----------------- | ---- | ----------- |
| `extension.WithEastAsianLineBreaks` | `...extension.EastAsianLineBreaksStyle` | Soft line breaks are rendered as a newline. Some asian users will see it as an unnecessary space. With this option, soft line breaks between east asian wide characters will be ignored. This defaults to `EastAsianLineBreaksStyleSimple`. |
| `extension.WithEscapedSpace` | `-` | Without spaces around an emphasis started with east asian punctuations, it is not interpreted as an emphasis(as defined in CommonMark spec). With this option, you can avoid this inconvenient behavior by putting 'not rendered' spaces around an emphasis like `太郎は\ **「こんにちわ」**\ といった`. |
#### Styles of Line Breaking
| Style | Description |
| ----- | ----------- |
| `EastAsianLineBreaksStyleSimple` | Soft line breaks are ignored if both sides of the break are east asian wide character. This behavior is the same as [`east_asian_line_breaks`](https://pandoc.org/MANUAL.html#extension-east_asian_line_breaks) in Pandoc. |
| `EastAsianLineBreaksCSS3Draft` | This option implements CSS text level3 [Segment Break Transformation Rules](https://drafts.csswg.org/css-text-3/#line-break-transform) with [some enhancements](https://github.com/w3c/csswg-drafts/issues/5086). |
#### Example of `EastAsianLineBreaksStyleSimple`
Input Markdown:
```md
私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。
```
Output:
```html
<p>私はプログラマーです。東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。</p>
```
#### Example of `EastAsianLineBreaksCSS3Draft`
Input Markdown:
```md
私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。
```
Output:
```html
<p>私はプログラマーです。東京の会社に勤めています。GoでWebアプリケーションを開発しています。</p>
```
Security
--------------------
By default, goldmark does not render raw HTML or potentially-dangerous URLs.
If you need to gain more control over untrusted contents, it is recommended that you
use an HTML sanitizer such as [bluemonday](https://github.com/microcosm-cc/bluemonday).
Benchmark
--------------------
You can run this benchmark in the `_benchmark` directory.
### against other golang libraries
blackfriday v2 seems to be the fastest, but as it is not CommonMark compliant, its performance cannot be directly compared to that of the CommonMark-compliant libraries.
goldmark, meanwhile, builds a clean, extensible AST structure, achieves full compliance with
CommonMark, and consumes less memory, all while being reasonably fast.
- MBP 2019 13″(i5, 16GB), Go1.17
```
BenchmarkMarkdown/Blackfriday-v2-8 302 3743747 ns/op 3290445 B/op 20050 allocs/op
BenchmarkMarkdown/GoldMark-8 280 4200974 ns/op 2559738 B/op 13435 allocs/op
BenchmarkMarkdown/CommonMark-8 226 5283686 ns/op 2702490 B/op 20792 allocs/op
BenchmarkMarkdown/Lute-8 12 92652857 ns/op 10602649 B/op 40555 allocs/op
BenchmarkMarkdown/GoMarkdown-8 13 81380167 ns/op 2245002 B/op 22889 allocs/op
```
### against cmark (CommonMark reference implementation written in C)
- MBP 2019 13″(i5, 16GB), Go1.17
```
----------- cmark -----------
file: _data.md
iteration: 50
average: 0.0044073057 sec
------- goldmark -------
file: _data.md
iteration: 50
average: 0.0041611990 sec
```
As you can see, goldmark's performance is on par with cmark's.
Extensions
--------------------
### List of extensions
- [goldmark-meta](https://github.com/yuin/goldmark-meta): A YAML metadata
extension for the goldmark Markdown parser.
- [goldmark-highlighting](https://github.com/yuin/goldmark-highlighting): A syntax-highlighting extension
for the goldmark markdown parser.
- [goldmark-emoji](https://github.com/yuin/goldmark-emoji): An emoji
extension for the goldmark Markdown parser.
- [goldmark-mathjax](https://github.com/litao91/goldmark-mathjax): Mathjax support for the goldmark markdown parser
- [goldmark-pdf](https://github.com/stephenafamo/goldmark-pdf): A PDF renderer that can be passed to `goldmark.WithRenderer()`.
- [goldmark-hashtag](https://github.com/abhinav/goldmark-hashtag): Adds support for `#hashtag`-based tagging to goldmark.
- [goldmark-wikilink](https://github.com/abhinav/goldmark-wikilink): Adds support for `[[wiki]]`-style links to goldmark.
- [goldmark-anchor](https://github.com/abhinav/goldmark-anchor): Adds anchors (permalinks) next to all headers in a document.
- [goldmark-figure](https://github.com/mangoumbrella/goldmark-figure): Adds support for rendering paragraphs starting with an image to `<figure>` elements.
- [goldmark-frontmatter](https://github.com/abhinav/goldmark-frontmatter): Adds support for YAML, TOML, and custom front matter to documents.
- [goldmark-toc](https://github.com/abhinav/goldmark-toc): Adds support for generating tables-of-contents for goldmark documents.
- [goldmark-mermaid](https://github.com/abhinav/goldmark-mermaid): Adds support for rendering [Mermaid](https://mermaid-js.github.io/mermaid/) diagrams in goldmark documents.
- [goldmark-pikchr](https://github.com/jchenry/goldmark-pikchr): Adds support for rendering [Pikchr](https://pikchr.org/home/doc/trunk/homepage.md) diagrams in goldmark documents.
- [goldmark-embed](https://github.com/13rac1/goldmark-embed): Adds support for rendering embeds from YouTube links.
- [goldmark-latex](https://github.com/soypat/goldmark-latex): A $\LaTeX$ renderer that can be passed to `goldmark.WithRenderer()`.
- [goldmark-fences](https://github.com/stefanfritsch/goldmark-fences): Support for pandoc-style [fenced divs](https://pandoc.org/MANUAL.html#divs-and-spans) in goldmark.
- [goldmark-d2](https://github.com/FurqanSoftware/goldmark-d2): Adds support for [D2](https://d2lang.com/) diagrams.
- [goldmark-katex](https://github.com/FurqanSoftware/goldmark-katex): Adds support for [KaTeX](https://katex.org/) math and equations.
- [goldmark-img64](https://github.com/tenkoh/goldmark-img64): Adds support for embedding images into the document as DataURL (base64 encoded).
- [goldmark-enclave](https://github.com/quailyquaily/goldmark-enclave): Adds support for embedding youtube/bilibili video, X's [oembed X](https://publish.x.com/), [tradingview chart](https://www.tradingview.com/widget/)'s chart, [quaily widget](https://quaily.com), [spotify embeds](https://developer.spotify.com/documentation/embeds), [dify embed](https://dify.ai/) and html audio into the document.
- [goldmark-wiki-table](https://github.com/movsb/goldmark-wiki-table): Adds support for embedding Wiki Tables.
- [goldmark-tgmd](https://github.com/Mad-Pixels/goldmark-tgmd): A Telegram markdown renderer that can be passed to `goldmark.WithRenderer()`.
### Loading extensions at runtime
[goldmark-dynamic](https://github.com/yuin/goldmark-dynamic) allows you to write a goldmark extension in Lua and load it at runtime without re-compilation.
Please refer to [goldmark-dynamic](https://github.com/yuin/goldmark-dynamic) for details.
goldmark internal(for extension developers)
----------------------------------------------
### Overview
goldmark's Markdown processing is outlined in the diagram below.
```
<Markdown in []byte, parser.Context>
|
V
+-------- parser.Parser ---------------------------
| 1. Parse block elements into AST
| 1. If a parsed block is a paragraph, apply
| ast.ParagraphTransformer
| 2. Traverse AST and parse blocks.
| 1. Process delimiters(emphasis) at the end of
| block parsing
| 3. Apply parser.ASTTransformers to AST
|
V
<ast.Node>
|
V
+------- renderer.Renderer ------------------------
| 1. Traverse AST and apply renderer.NodeRenderer
| corespond to the node type
|
V
<Output>
```
### Parsing
Markdown documents are read through `text.Reader` interface.
AST nodes do not have concrete text. AST nodes have segment information of the documents, represented by `text.Segment` .
`text.Segment` has 3 attributes: `Start`, `End`, `Padding` .
(TBC)
**TODO**
See `extension` directory for examples of extensions.
@ -237,63 +554,6 @@ Summary:
3. Write a renderer that implements `renderer.NodeRenderer`.
4. Define your goldmark extension that implements `goldmark.Extender`.
Security
--------------------
By default, goldmark does not render raw HTMLs and potentially dangerous urls.
If you need to gain more control over untrusted contents, it is recommended to
use HTML sanitizer such as [bluemonday](https://github.com/microcosm-cc/bluemonday).
Benchmark
--------------------
You can run this benchmark in the `_benchmark` directory.
### against other golang libraries
blackfriday v2 seems fastest, but it is not CommonMark compiliant so performance of the
blackfriday v2 can not simply be compared with other Commonmark compliant libraries.
Though goldmark builds clean extensible AST structure and get full compliance with
Commonmark, it is resonably fast and less memory consumption.
This benchmark parses a relatively large markdown text. In such text, concurrent parsing
makes performance better a little.
```
BenchmarkMarkdown/Blackfriday-v2-4 300 5316935 ns/op 3321072 B/op 20050 allocs/op
BenchmarkMarkdown/GoldMark(workers=16)-4 300 5506219 ns/op 2702358 B/op 14494 allocs/op
BenchmarkMarkdown/GoldMark-4 200 5903779 ns/op 2594304 B/op 13861 allocs/op
BenchmarkMarkdown/CommonMark-4 200 7147659 ns/op 2752977 B/op 18827 allocs/op
BenchmarkMarkdown/Lute-4 200 5930621 ns/op 2839712 B/op 21165 allocs/op
BenchmarkMarkdown/GoMarkdown-4 10 120953070 ns/op 2192278 B/op 22174 allocs/op
```
### against cmark(A CommonMark reference implementation written in c)
```
----------- cmark -----------
file: _data.md
iteration: 50
average: 0.0047014618 sec
------- goldmark -------
file: _data.md
iteration: 50
average: 0.0052624750 sec
------- goldmark(workers=16) -------
file: _data.md
iteration: 50
average: 0.0044918780 sec
```
As you can see, goldmark performs pretty much equally to the cmark.
Extensions
--------------------
- [goldmark-meta](https://github.com/yuin/goldmark-meta) : A YAML metadata
extension for the goldmark markdown parser.
- [goldmark-highlighting](https://github.com/yuin/goldmark-highlighting) : A Syntax highlighting extension
for the goldmark markdown parser.
- [goldmark-mathjax](https://github.com/litao91/goldmark-mathjax) : Mathjax support for goldmark markdown parser
Donation
--------------------

View file

@ -4,18 +4,39 @@ ifeq ($(OS),Windows_NT)
CMARK_BIN=cmark_benchmark.exe
CMARK_RUN=bash -c "PATH=./cmark-master/build/src:$${PATH} ./$(CMARK_BIN)"
endif
ifneq ($(WSL_INTEROP),)
CMARK_BIN=cmark_benchmark.exe
CMARK_RUN=cp ./cmark-master/build-mingw/windows/bin/libcmark.dll . && ./$(CMARK_BIN); rm -f libcmark.dll
endif
.PHONY: run
run: $(CMARK_BIN)
$(CMARK_RUN)
go run ./goldmark_benchmark.go
@ $(CMARK_RUN)
@ if [ -z "$${WSL_INTEROP}" ]; then \
go run ./goldmark_benchmark.go; \
else \
GOOS=windows GOARCH=amd64 go build -o goldmark_benchmark.exe ./goldmark_benchmark.go && ./goldmark_benchmark.exe; \
fi
./cmark-master/build/src/config.h:
./cmark-master/Makefile:
wget -nc -O cmark.zip https://github.com/commonmark/cmark/archive/master.zip
unzip cmark.zip
rm -f cmark.zip
cd cmark-master && make
@ if [ -z "$${WSL_INTEROP}" ]; then \
cd cmark-master && make; \
else \
cd cmark-master && make mingw; \
fi
$(CMARK_BIN): ./cmark-master/build/src/config.h
gcc -I./cmark-master/build/src -I./cmark-master/src cmark_benchmark.c -o $(CMARK_BIN) -L./cmark-master/build/src -lcmark
$(CMARK_BIN): ./cmark-master/Makefile
@ if [ -z "$${WSL_INTEROP}" ]; then \
gcc -I./cmark-master/build/src -I./cmark-master/src cmark_benchmark.c -o $(CMARK_BIN) -L./cmark-master/build/src -lcmark; \
else \
i686-w64-mingw32-gcc -I./cmark-master/build-mingw/windows/include cmark_benchmark.c -o $(CMARK_BIN) -L./cmark-master/build-mingw/windows/lib -lcmark.dll; \
fi
.PHONY: clean
clean:
rm -f $(CMARK_BIN)
rm -f goldmark_benchmark.exe

View file

@ -9,7 +9,6 @@ import (
"time"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
)
@ -43,18 +42,4 @@ func main() {
fmt.Printf("file: %s\n", file)
fmt.Printf("iteration: %d\n", n)
fmt.Printf("average: %.10f sec\n", float64((int64(sum)/int64(n)))/1000000000.0)
sum = time.Duration(0)
for i := 0; i < n; i++ {
start := time.Now()
out.Reset()
if err := markdown.Convert(source, &out, parser.WithWorkers(16)); err != nil {
panic(err)
}
sum += time.Since(start)
}
fmt.Printf("------- goldmark(workers=16) -------\n")
fmt.Printf("file: %s\n", file)
fmt.Printf("iteration: %d\n", n)
fmt.Printf("average: %.10f sec\n", float64((int64(sum)/int64(n)))/1000000000.0)
}

View file

@ -1,4 +1,4 @@
package main
package benchmark
import (
"bytes"
@ -7,53 +7,31 @@ import (
gomarkdown "github.com/gomarkdown/markdown"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/util"
"gitlab.com/golang-commonmark/markdown"
bf1 "github.com/russross/blackfriday"
bf2 "gopkg.in/russross/blackfriday.v2"
"github.com/russross/blackfriday/v2"
"github.com/b3log/lute"
"github.com/88250/lute"
)
func BenchmarkMarkdown(b *testing.B) {
b.Run("Blackfriday-v1", func(b *testing.B) {
r := func(src []byte) ([]byte, error) {
out := bf1.MarkdownBasic(src)
return out, nil
}
doBenchmark(b, r)
})
b.Run("Blackfriday-v2", func(b *testing.B) {
r := func(src []byte) ([]byte, error) {
out := bf2.Run(src)
out := blackfriday.Run(src)
return out, nil
}
doBenchmark(b, r)
})
b.Run("GoldMark(workers=16)", func(b *testing.B) {
markdown := goldmark.New(
goldmark.WithRendererOptions(html.WithXHTML(), html.WithUnsafe()),
)
r := func(src []byte) ([]byte, error) {
var out bytes.Buffer
err := markdown.Convert(src, &out, parser.WithWorkers(16))
return out.Bytes(), err
}
doBenchmark(b, r)
})
b.Run("GoldMark", func(b *testing.B) {
markdown := goldmark.New(
goldmark.WithRendererOptions(html.WithXHTML(), html.WithUnsafe()),
)
r := func(src []byte) ([]byte, error) {
var out bytes.Buffer
err := markdown.Convert(src, &out, parser.WithWorkers(0))
err := markdown.Convert(src, &out)
return out.Bytes(), err
}
doBenchmark(b, r)
@ -70,15 +48,18 @@ func BenchmarkMarkdown(b *testing.B) {
})
b.Run("Lute", func(b *testing.B) {
luteEngine := lute.New(
lute.GFM(false),
lute.CodeSyntaxHighlight(false),
lute.SoftBreak2HardBreak(false),
lute.AutoSpace(false),
lute.FixTermTypo(false))
luteEngine := lute.New()
luteEngine.SetGFMAutoLink(false)
luteEngine.SetGFMStrikethrough(false)
luteEngine.SetGFMTable(false)
luteEngine.SetGFMTaskListItem(false)
luteEngine.SetCodeSyntaxHighlight(false)
luteEngine.SetSoftBreak2HardBreak(false)
luteEngine.SetAutoSpace(false)
luteEngine.SetFixTermTypo(false)
r := func(src []byte) ([]byte, error) {
out, err := luteEngine.FormatStr("Benchmark", util.BytesToReadOnlyString(src))
return util.StringToReadOnlyBytes(out), err
out := luteEngine.MarkdownStr("Benchmark", util.BytesToReadOnlyString(src))
return util.StringToReadOnlyBytes(out), nil
}
doBenchmark(b, r)
})

25
_benchmark/go/go.mod Normal file
View file

@ -0,0 +1,25 @@
module banchmark
go 1.17
require (
github.com/88250/lute v1.7.5
github.com/gomarkdown/markdown v0.0.0-20230322041520-c84983bdbf2a
github.com/russross/blackfriday/v2 v2.1.0
github.com/yuin/goldmark v0.0.0
gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a
)
require (
github.com/alecthomas/chroma v0.10.0 // indirect
github.com/dlclark/regexp2 v1.10.0 // indirect
github.com/gopherjs/gopherjs v1.17.2 // indirect
gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181 // indirect
gitlab.com/golang-commonmark/linkify v0.0.0-20191026162114-a0c2df6c8f82 // indirect
gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84 // indirect
gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f // indirect
golang.org/x/text v0.10.0 // indirect
)
replace gopkg.in/russross/blackfriday.v2 v2.0.1 => github.com/russross/blackfriday/v2 v2.0.1
replace github.com/yuin/goldmark v0.0.0 => ../../

42
_benchmark/go/go.sum Normal file
View file

@ -0,0 +1,42 @@
github.com/88250/lute v1.7.5 h1:mcPFURh5sK1WH1kFRjqK5DkMWOfVN2BhyrXitN8GmpQ=
github.com/88250/lute v1.7.5/go.mod h1:cEoBGi0zArPqAsp0MdG9SKinvH/xxZZWXU7sRx8vHSA=
github.com/alecthomas/chroma v0.10.0 h1:7XDcGkCQopCNKjZHfYrNLraA+M7e0fMiJ/Mfikbfjek=
github.com/alecthomas/chroma v0.10.0/go.mod h1:jtJATyUxlIORhUOFNA9NZDWGAQ8wpxQQqNSB4rjA/1s=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/dlclark/regexp2 v1.4.0/go.mod h1:2pZnwuY/m+8K6iRw6wQdMtk+rH5tNGR1i55kozfMjCc=
github.com/dlclark/regexp2 v1.10.0 h1:+/GIL799phkJqYW+3YbOd8LCcbHzT0Pbo8zl70MHsq0=
github.com/dlclark/regexp2 v1.10.0/go.mod h1:DHkYz0B9wPfa6wondMfaivmHpzrQ3v9q8cnmRbL6yW8=
github.com/gomarkdown/markdown v0.0.0-20230322041520-c84983bdbf2a h1:AWZzzFrqyjYlRloN6edwTLTUbKxf5flLXNuTBDm3Ews=
github.com/gomarkdown/markdown v0.0.0-20230322041520-c84983bdbf2a/go.mod h1:JDGcbDT52eL4fju3sZ4TeHGsQwhG9nbDV21aMyhwPoA=
github.com/gopherjs/gopherjs v1.17.2 h1:fQnZVsXk8uxXIStYb0N4bGk7jeyTalG/wsZjQ25dO0g=
github.com/gopherjs/gopherjs v1.17.2/go.mod h1:pRRIvn/QzFLrKfvEz3qUuEhtE/zLCWfreZ6J5gM2i+k=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/testify v1.7.0 h1:nwc3DEeHmmLAfoZucVR881uASk0Mfjw8xYJ99tb5CcY=
github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/yuin/goldmark v1.2.1 h1:ruQGxdhGHe7FWOJPT0mKs5+pD2Xs1Bm/kdGlHO04FmM=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181 h1:K+bMSIx9A7mLES1rtG+qKduLIXq40DAzYHtb0XuCukA=
gitlab.com/golang-commonmark/html v0.0.0-20191124015941-a22733972181/go.mod h1:dzYhVIwWCtzPAa4QP98wfB9+mzt33MSmM8wsKiMi2ow=
gitlab.com/golang-commonmark/linkify v0.0.0-20191026162114-a0c2df6c8f82 h1:oYrL81N608MLZhma3ruL8qTM4xcpYECGut8KSxRY59g=
gitlab.com/golang-commonmark/linkify v0.0.0-20191026162114-a0c2df6c8f82/go.mod h1:Gn+LZmCrhPECMD3SOKlE+BOHwhOYD9j7WT9NUtkCrC8=
gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a h1:O85GKETcmnCNAfv4Aym9tepU8OE0NmcZNqPlXcsBKBs=
gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a/go.mod h1:LaSIs30YPGs1H5jwGgPhLzc8vkNc/k0rDX/fEZqiU/M=
gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84 h1:qqjvoVXdWIcZCLPMlzgA7P9FZWdPGPvP/l3ef8GzV6o=
gitlab.com/golang-commonmark/mdurl v0.0.0-20191124015652-932350d1cb84/go.mod h1:IJZ+fdMvbW2qW6htJx7sLJ04FEs4Ldl/MDsJtMKywfw=
gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f h1:Wku8eEdeJqIOFHtrfkYUByc4bCaTeA6fL0UJgfEiFMI=
gitlab.com/golang-commonmark/puny v0.0.0-20191124015043-9f83538fa04f/go.mod h1:Tiuhl+njh/JIg0uS/sOJVYi0x2HEa5rc1OAaVsb5tAs=
gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638 h1:uPZaMiz6Sz0PZs3IZJWpU5qHKGNy///1pacZC9txiUI=
gitlab.com/opennota/wd v0.0.0-20180912061657-c5d65f63c638/go.mod h1:EGRJaqe2eO9XGmFtQCvV3Lm9NLico3UhFwUpCG/+mVU=
golang.org/x/text v0.3.2/go.mod h1:bEr9sfX3Q8Zfm5fL9x+3itogRgK3+ptLWKqgva+5dAk=
golang.org/x/text v0.10.0 h1:UpjohKhiEgNc0CSauXmwYftY1+LlaC75SJwh0SgCX58=
golang.org/x/text v0.10.0/go.mod h1:TvPlkZtksWOMsz7fbANvkp4WM8x/WCo/om8BMLbz+aE=
golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c h1:dUUwHk2QECo/6vqA44rthZ8ie2QXMNeKRTHCNY2nXvo=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=

View file

@ -8,3 +8,838 @@
B</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
2
//- - - - - - - - -//
**test**\
test**test**\
**test**test\
test**test**
//- - - - - - - - -//
<p><strong>test</strong><br />
test<strong>test</strong><br />
<strong>test</strong>test<br />
test<strong>test</strong></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
>* >
> 1
> 2
>3
//- - - - - - - - -//
<blockquote>
<ul>
<li>
<blockquote>
</blockquote>
</li>
</ul>
<p>1
2
3</p>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4
//- - - - - - - - -//
`test`a`test`
//- - - - - - - - -//
<p><code>test</code>a<code>test</code></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
5
//- - - - - - - - -//
_**TL/DR** - [Go see summary.](#my-summary-area)_
//- - - - - - - - -//
<p><em><strong>TL/DR</strong> - <a href="#my-summary-area">Go see summary.</a></em></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
6
//- - - - - - - - -//
[This link won't be rendered
correctly](https://geeksocket.in/some-long-link-here "This is the
place where everything breaks")
//- - - - - - - - -//
<p><a href="https://geeksocket.in/some-long-link-here" title="This is the
place where everything breaks">This link won't be rendered
correctly</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
7
//- - - - - - - - -//
[](./target.md)
//- - - - - - - - -//
<p><a href="./target.md"></a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
8
//- - - - - - - - -//
[]()
//- - - - - - - - -//
<p><a href=""></a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
9
//- - - - - - - - -//
[daß] is the old german spelling of [dass]
[daß]: www.das-dass.de
//- - - - - - - - -//
<p><a href="www.das-dass.de">daß</a> is the old german spelling of <a href="www.das-dass.de">dass</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
10
//- - - - - - - - -//
1. First step.
~~~
aaa
---
bbb
~~~
2. few other steps.
//- - - - - - - - -//
<ol>
<li>
<p>First step.</p>
<pre><code>aaa
---
bbb
</code></pre>
</li>
<li>
<p>few other steps.</p>
</li>
</ol>
//= = = = = = = = = = = = = = = = = = = = = = = =//
11: delimiters between ascii punctuations should be parsed
//- - - - - - - - -//
`{%`_name_`%}`
//- - - - - - - - -//
<p><code>{%</code><em>name</em><code>%}</code></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
12: the alt attribute of img should be escaped
//- - - - - - - - -//
!["](quot.jpg)
!['](apos.jpg)
![<](lt.jpg)
![>](gt.jpg)
![&](amp.jpg)
//- - - - - - - - -//
<p><img src="quot.jpg" alt="&quot;" />
<img src="apos.jpg" alt="'" />
<img src="lt.jpg" alt="&lt;" />
<img src="gt.jpg" alt="&gt;" />
<img src="amp.jpg" alt="&amp;" /></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
13: fenced code block starting with tab inside list
//- - - - - - - - -//
* foo
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
14: fenced code block inside list, mismatched tab start
//- - - - - - - - -//
* foo
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
15: fenced code block inside nested list
//- - - - - - - - -//
* foo
- bar
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<ul>
<li>bar
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
16: indented code block starting with a tab.
//- - - - - - - - -//
* foo
foo
foo
//- - - - - - - - -//
<ul>
<li>
<p>foo</p>
<pre><code>foo
foo
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
17: fenced code block in list, empty line, spaces on start
//- - - - - - - - -//
* foo
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
18: fenced code block in list, empty line, no spaces on start
//- - - - - - - - -//
* foo
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
19: fenced code block inside nested list, empty line, spaces on start
//- - - - - - - - -//
* foo
- bar
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<ul>
<li>bar
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
20: fenced code block inside nested list, empty line, no space on start
//- - - - - - - - -//
* foo
- bar
```Makefile
foo
foo
```
//- - - - - - - - -//
<ul>
<li>foo
<ul>
<li>bar
<pre><code class="language-Makefile">foo
foo
</code></pre>
</li>
</ul>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
21: Fenced code block within list can start with tab
//- - - - - - - - -//
- List
```
A
B
C
```
//- - - - - - - - -//
<ul>
<li>
<p>List</p>
<pre><code>A
B
C
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
22: Indented code block within list can start with tab
//- - - - - - - - -//
- List
A
B
C
a
//- - - - - - - - -//
<ul>
<li>
<p>List</p>
<pre><code>A
B
C
</code></pre>
</li>
</ul>
<p>a</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
23: Emphasis corner case(yuin/goldmark#245)
//- - - - - - - - -//
a* b c d *e*
//- - - - - - - - -//
<p>a* b c d <em>e</em></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
24: HTML block tags can contain trailing spaces
//- - - - - - - - -//
<aaa >
//- - - - - - - - -//
<aaa >
//= = = = = = = = = = = = = = = = = = = = = = = =//
25: Indented code blocks can start with tab
//- - - - - - - - -//
x
//- - - - - - - - -//
<pre><code> x
</code></pre>
//= = = = = = = = = = = = = = = = = = = = = = = =//
26: NUL bytes must be replaced with U+FFFD
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
hello\x00world
<?\x00
//- - - - - - - - -//
<p>hello\ufffdworld</p>
<?\uFFFD
//= = = = = = = = = = = = = = = = = = = = = = = =//
27: Newlines in code spans must be preserved as a space
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
`\n`
`x\n`
`\nx`
//- - - - - - - - -//
<p><code> </code></p>
<p><code>x </code></p>
<p><code> x</code></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
28: Single # is a heading level 1
//- - - - - - - - -//
#
//- - - - - - - - -//
<h1></h1>
//= = = = = = = = = = = = = = = = = = = = = = = =//
29: An empty list item cannot interrupt a paragraph
//- - - - - - - - -//
x
*
//- - - - - - - - -//
<p>x
*</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
30: A link reference definition followed by a single quote without closer
//- - - - - - - - -//
[x]
[x]: <>
'
//- - - - - - - - -//
<p><a href="">x</a></p>
<p>'</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
31: A link reference definition followed by a double quote without closer
//- - - - - - - - -//
[x]
[x]: <>
"
//- - - - - - - - -//
<p><a href="">x</a></p>
<p>&quot;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
32: Hex character entities must be limited to 6 characters
//- - - - - - - - -//
&#x0000041;
//- - - - - - - - -//
<p>&amp;#x0000041;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
33: \x01 should be escaped all the time
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
[x](\x01)
//- - - - - - - - -//
<p><a href="%01">x</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
34: A form feed should not be treated as a space
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
x \f
//- - - - - - - - -//
<p>x \f</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
35: A link reference definition can contain a new line
//- - - - - - - - -//
This is a [test][foo
bar] 1...2..3...
[foo bar]: /
//- - - - - - - - -//
<p>This is a <a href="/">test</a> 1...2..3...</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
36: Emphasis and links
//- - - - - - - - -//
_a[b_c_](d)
//- - - - - - - - -//
<p>_a<a href="d">b_c_</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
37: Tabs and spaces
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
\t\t x\n
//- - - - - - - - -//
<pre><code>\t x\n</code></pre>
//= = = = = = = = = = = = = = = = = = = = = = = =//
38: Decimal HTML entity literals should allow 7 digits
//- - - - - - - - -//
&#7654321;
//- - - - - - - - -//
<p>\uFFFD</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
39: Decimal HTML entities should not be interpreted as octal when starting with a 0
//- - - - - - - - -//
&#0100;
//- - - - - - - - -//
<p>d</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
40: Invalid HTML tag names
//- - - - - - - - -//
<1>
<a:>
<a\f>
< p>
//- - - - - - - - -//
<p>&lt;1&gt;</p>
<p>&lt;a:&gt;</p>
<p>&lt;a\f&gt;</p>
<p>&lt; p&gt;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
41: Link references can not contain spaces after link label
//- - - - - - - - -//
[x]
:>
[o] :x
//- - - - - - - - -//
<p>[x]
:&gt;</p>
<p>[o] :x</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
42: Unclosed link reference titles can interrupt link references
//- - - - - - - - -//
[r]:
<>
'
[o]:
x
'
//- - - - - - - - -//
<p>'</p>
<p>'</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
43: A link containing an image containing a link should disable the outer link
//- - - - - - - - -//
[ ![ [b](c) ](x) ](y)
//- - - - - - - - -//
<p>[ <img src="x" alt=" b " /> ](y)</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
44: An empty list item(with trailing spaces) cannot interrupt a paragraph
//- - - - - - - - -//
a
*
//- - - - - - - - -//
<p>a
*</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
45: Multiple empty list items
//- - - - - - - - -//
-
-
//- - - - - - - - -//
<ul>
<li></li>
<li></li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
46: Vertical tab should not be treated as spaces
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
\v
//- - - - - - - - -//
<p>\v</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
47: Escape back slashes should not be treated as hard line breaks
//- - - - - - - - -//
\\\\
a
//- - - - - - - - -//
<p>\
a</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
48: Multiple paragraphs in tight list
//- - - - - - - - -//
- a
>
b
//- - - - - - - - -//
<ul>
<li>a
<blockquote>
</blockquote>
b</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
49: A list item that is indented up to 3 spaces after an empty list item
//- - - - - - - - -//
1.
1. b
-
- b
//- - - - - - - - -//
<ol>
<li></li>
<li>
<p>b</p>
</li>
</ol>
<ul>
<li></li>
<li>
<p>b</p>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
50: Spaces before a visible hard linebreak should be preserved
//- - - - - - - - -//
a \
b
//- - - - - - - - -//
<p>a <br />
b</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
51: Empty line in a fenced code block under list items
//- - - - - - - - -//
* This is a list item
```
This is a test
This line will be dropped.
This line will be displayed.
```
//- - - - - - - - -//
<ul>
<li>This is a list item
<pre><code>This is a test
This line will be dropped.
This line will be displayed.
</code></pre>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
52: windows-style newline and HTMLs
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
<a \r\nhref='link'>link</a>
<video autoplay muted loop>\r\n<source src=\"https://example.com/example.mp4\" type=\"video/mp4\">\r\nYour browser does not support the video tag.\r\n</video>
//- - - - - - - - -//
<p><a \r\nhref='link'>link</a></p>
<video autoplay muted loop>\r\n<source src=\"https://example.com/example.mp4\" type=\"video/mp4\">\r\nYour browser does not support the video tag.\r\n</video>
//= = = = = = = = = = = = = = = = = = = = = = = =//
53: HTML comment without trailing new lines
OPTIONS: {"trim": true}
//- - - - - - - - -//
<!--
-->
//- - - - - - - - -//
<!--
-->
//= = = = = = = = = = = = = = = = = = = = = = = =//
54: Escaped characters followed by a null character
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
\\\x00\"
//- - - - - - - - -//
<p>\\\ufffd&quot;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
55: inline HTML comment
//- - - - - - - - -//
a <!-- b --> c
a <!-- b -->
//- - - - - - - - -//
<p>a <!-- b --> c</p>
<p>a <!-- b --></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
56: An empty list followed by blockquote
//- - - - - - - - -//
1.
> This is a quote.
//- - - - - - - - -//
<ol>
<li></li>
</ol>
<blockquote>
<p>This is a quote.</p>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//
57: Tabbed fenced code block within a list
//- - - - - - - - -//
1.
```
```
//- - - - - - - - -//
<ol>
<li>
<pre><code></code></pre>
</li>
</ol>
//= = = = = = = = = = = = = = = = = = = = = = = =//
58: HTML end tag without trailing new lines
OPTIONS: {"trim": true}
//- - - - - - - - -//
<pre>
</pre>
//- - - - - - - - -//
<pre>
</pre>
//= = = = = = = = = = = = = = = = = = = = = = = =//
59: Raw HTML tag with one new line
//- - - - - - - - -//
<img src=./.assets/logo.svg
/>
//- - - - - - - - -//
<p><img src=./.assets/logo.svg
/></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
60: Raw HTML tag with multiple new lines
//- - - - - - - - -//
<img src=./.assets/logo.svg
/>
//- - - - - - - - -//
<p>&lt;img src=./.assets/logo.svg</p>
<p>/&gt;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
61: Image alt with a new line
//- - - - - - - - -//
![alt
text](logo.png)
//- - - - - - - - -//
<p><img src="logo.png" alt="alt
text" /></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
62: Image alt with an escaped character
//- - - - - - - - -//
![\`alt](https://example.com/img.png)
//- - - - - - - - -//
<p><img src="https://example.com/img.png" alt="`alt" /></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
63: Emphasis in link label
//- - - - - - - - -//
[*[a]*](b)
//- - - - - - - - -//
<p><a href="b"><em>[a]</em></a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
64: Nested list under an empty list item
//- - - - - - - - -//
-
- foo
//- - - - - - - - -//
<ul>
<li>
<ul>
<li>foo</li>
</ul>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
65: Nested fenced code block with tab
//- - - - - - - - -//
> ```
> 0
> ```
//- - - - - - - - -//
<blockquote>
<pre><code> 0
</code></pre>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//
66: EOF should be rendered as a newline with an unclosed block(w/ TAB)
//- - - - - - - - -//
> ```
> 0
//- - - - - - - - -//
<blockquote>
<pre><code> 0
</code></pre>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//
67: EOF should be rendered as a newline with an unclosed block
//- - - - - - - - -//
> ```
> 0
//- - - - - - - - -//
<blockquote>
<pre><code> 0
</code></pre>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -8,20 +8,71 @@
## Title3 ## {#id_3 .class-3}
## Title4 ## {attr3=value3}
## Title4 ## {data-attr3=value3}
## Title5 ## {#id_5 attr5=value5}
## Title5 ## {#id_5 data-attr5=value5}
## Title6 ## {#id_6 .class6 attr6=value6}
## Title6 ## {#id_6 .class6 data-attr6=value6}
## Title7 ## {#id_7 attr7="value \"7"}
## Title7 ## {#id_7 data-attr7="value \"7"}
## Title8 {#id .className data-attrName=attrValue class="class1 class2"}
//- - - - - - - - -//
<h2 id="title-0">Title 0</h2>
<h2 id="id_1" class="class-1">Title1</h2>
<h2 id="id_2">Title2</h2>
<h2 id="id_3" class="class-3">Title3</h2>
<h2 attr3="value3" id="title4">Title4</h2>
<h2 id="id_5" attr5="value5">Title5</h2>
<h2 id="id_6" class="class6" attr6="value6">Title6</h2>
<h2 id="id_7" attr7="value &quot;7">Title7</h2>
<h2 data-attr3="value3" id="title4">Title4</h2>
<h2 id="id_5" data-attr5="value5">Title5</h2>
<h2 id="id_6" class="class6" data-attr6="value6">Title6</h2>
<h2 id="id_7" data-attr7="value &quot;7">Title7</h2>
<h2 id="id" class="className class1 class2" data-attrName="attrValue">Title8</h2>
//= = = = = = = = = = = = = = = = = = = = = = = =//
2
//- - - - - - - - -//
#
# FOO
//- - - - - - - - -//
<h1 id="heading"></h1>
<h1 id="foo">FOO</h1>
//= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
## `records(self, zone, params={})`
//- - - - - - - - -//
<h2 id="recordsself-zone-params"><code>records(self, zone, params={})</code></h2>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4
//- - - - - - - - -//
## Test {#hey .sort,class=fine,class=shell} Doesn't matter
//- - - - - - - - -//
<h2 id="test-hey-sortclassfineclassshell-doesnt-matter">Test {#hey .sort,class=fine,class=shell} Doesn't matter</h2>
//= = = = = = = = = = = = = = = = = = = = = = = =//
5
//- - - - - - - - -//
## Test ## {#hey .sort,class=fine,class=shell} Doesn't matter
//- - - - - - - - -//
<h2 id="test--hey-sortclassfineclassshell-doesnt-matter">Test ## {#hey .sort,class=fine,class=shell} Doesn't matter</h2>
//= = = = = = = = = = = = = = = = = = = = = = = =//
6: class must be a string
//- - - - - - - - -//
# Test ## {class=0#.}
//- - - - - - - - -//
<h1 id="test--class0">Test ## {class=0#.}</h1>
//= = = = = = = = = = = = = = = = = = = = = = = =//
7: short handed ids can contain hyphens ("-"), underscores ("_"), colons (":"), and periods (".")
//- - - - - - - - -//
# Test ## {#id-foo_bar:baz.qux .foobar}
//- - - - - - - - -//
<h1 id="id-foo_bar:baz.qux" class="foobar">Test</h1>
//= = = = = = = = = = = = = = = = = = = = = = = =//

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,61 @@
package main
import (
"archive/zip"
"encoding/json"
"io/ioutil"
"log"
"os"
"strconv"
"strings"
)
type TestCase struct {
Example int `json:"example"`
Markdown string `json:"markdown"`
}
func main() {
corpus_out := os.Args[1]
if !strings.HasSuffix(corpus_out, ".zip") {
log.Fatalln("Expected command line:", os.Args[0], "<corpus_output>.zip")
}
zip_file, err := os.Create(corpus_out)
zip_writer := zip.NewWriter(zip_file)
if err != nil {
log.Fatalln("Failed creating file:", err)
}
json_corpus := "_test/spec.json"
bs, err := ioutil.ReadFile(json_corpus)
if err != nil {
log.Fatalln("Could not open file:", json_corpus)
panic(err)
}
var testCases []TestCase
if err := json.Unmarshal(bs, &testCases); err != nil {
panic(err)
}
for _, c := range testCases {
file_in_zip := "example-" + strconv.Itoa(c.Example)
f, err := zip_writer.Create(file_in_zip)
if err != nil {
log.Fatal(err)
}
_, err = f.Write([]byte(c.Markdown))
if err != nil {
log.Fatalf("Failed to write file: %s into zip file", file_in_zip)
}
}
err = zip_writer.Close()
if err != nil {
log.Fatal("Failed to close zip writer", err)
}
zip_file.Close()
}

View file

@ -0,0 +1,73 @@
package main
import (
"bufio"
"bytes"
"fmt"
"io/ioutil"
"net/http"
"os"
"strconv"
"strings"
)
const outPath = "../util/unicode_case_folding.go"
type caseFolding struct {
Class byte
From rune
To []rune
}
func main() {
url := "http://www.unicode.org/Public/14.0.0/ucd/CaseFolding.txt"
resp, err := http.Get(url)
if err != nil {
fmt.Printf("Failed to get CaseFolding.txt: %v\n", err)
os.Exit(1)
}
defer resp.Body.Close()
bs, err := ioutil.ReadAll(resp.Body)
if err != nil {
fmt.Printf("Failed to get CaseFolding.txt: %v\n", err)
os.Exit(1)
}
buf := bytes.NewBuffer(bs)
scanner := bufio.NewScanner(buf)
f, err := os.Create(outPath)
if err != nil {
fmt.Printf("Failed to open %s: %v\n", outPath, err)
os.Exit(1)
}
defer f.Close()
_, _ = f.WriteString("package util\n\n")
_, _ = f.WriteString("var unicodeCaseFoldings = map[rune][]rune {\n")
for scanner.Scan() {
line := scanner.Text()
if strings.HasPrefix(line, "#") || len(strings.TrimSpace(line)) == 0 {
continue
}
line = strings.Split(line, "#")[0]
parts := strings.Split(line, ";")
for i, p := range parts {
parts[i] = strings.TrimSpace(p)
}
cf := caseFolding{}
v, _ := strconv.ParseInt(parts[0], 16, 32)
cf.From = rune(int32(v))
cf.Class = parts[1][0]
for _, v := range strings.Split(parts[2], " ") {
c, _ := strconv.ParseInt(v, 16, 32)
cf.To = append(cf.To, rune(int32(c)))
}
if cf.Class != 'C' && cf.Class != 'F' {
continue
}
fmt.Fprintf(f, " %#x : %#v,\n", cf.From, cf.To)
}
fmt.Fprintf(f, "}\n")
}

View file

@ -39,17 +39,12 @@ func NewNodeKind(name string) NodeKind {
return kindMax
}
// An Attribute is an attribute of the Node
// An Attribute is an attribute of the Node.
type Attribute struct {
Name []byte
Value interface{}
}
var attrNameIDS = []byte("#")
var attrNameID = []byte("id")
var attrNameClassS = []byte(".")
var attrNameClass = []byte("class")
// A Node interface defines basic AST node functionalities.
type Node interface {
// Type returns a type of this node.
@ -98,6 +93,9 @@ type Node interface {
// RemoveChildren removes all children from this node.
RemoveChildren(self Node)
// SortChildren sorts childrens by comparator.
SortChildren(comparator func(n1, n2 Node) int)
// ReplaceChild replace a node v1 with a node insertee.
// If v1 is not children of this node, ReplaceChild append a insetee to the
// tail of the children.
@ -113,6 +111,11 @@ type Node interface {
// tail of the children.
InsertAfter(self, v1, insertee Node)
// OwnerDocument returns this node's owner document.
// If this node is not a child of the Document node, OwnerDocument
// returns nil.
OwnerDocument() *Document
// Dump dumps an AST tree structure to stdout.
// This function completely aimed for debugging.
// level is a indent level. Implementer should indent informations with
@ -120,6 +123,12 @@ type Node interface {
Dump(source []byte, level int)
// Text returns text values of this node.
// This method is valid only for some inline nodes.
// If this node is a block node, Text returns a text value as reasonable as possible.
// Notice that there are no 'correct' text values for the block nodes.
// Result for the block nodes may be different from your expectation.
//
// Deprecated: Use other properties of the node to get the text value(i.e. Pragraph.Lines, Text.Value).
Text(source []byte) []byte
// HasBlankPreviousLines returns true if the row before this node is blank,
@ -166,7 +175,7 @@ type Node interface {
RemoveAttributes()
}
// A BaseNode struct implements the Node interface.
// A BaseNode struct implements the Node interface partialliy.
type BaseNode struct {
firstChild Node
lastChild Node
@ -233,16 +242,51 @@ func (n *BaseNode) RemoveChild(self, v Node) {
// RemoveChildren implements Node.RemoveChildren .
func (n *BaseNode) RemoveChildren(self Node) {
for c := n.firstChild; c != nil; c = c.NextSibling() {
for c := n.firstChild; c != nil; {
c.SetParent(nil)
c.SetPreviousSibling(nil)
next := c.NextSibling()
c.SetNextSibling(nil)
c = next
}
n.firstChild = nil
n.lastChild = nil
n.childCount = 0
}
// SortChildren implements Node.SortChildren.
func (n *BaseNode) SortChildren(comparator func(n1, n2 Node) int) {
var sorted Node
current := n.firstChild
for current != nil {
next := current.NextSibling()
if sorted == nil || comparator(sorted, current) >= 0 {
current.SetNextSibling(sorted)
if sorted != nil {
sorted.SetPreviousSibling(current)
}
sorted = current
sorted.SetPreviousSibling(nil)
} else {
c := sorted
for c.NextSibling() != nil && comparator(c.NextSibling(), current) < 0 {
c = c.NextSibling()
}
current.SetNextSibling(c.NextSibling())
current.SetPreviousSibling(c)
if c.NextSibling() != nil {
c.NextSibling().SetPreviousSibling(current)
}
c.SetNextSibling(current)
}
current = next
}
n.firstChild = sorted
for c := n.firstChild; c != nil; c = c.NextSibling() {
n.lastChild = c
}
}
// FirstChild implements Node.FirstChild .
func (n *BaseNode) FirstChild() Node {
return n.firstChild
@ -320,11 +364,34 @@ func (n *BaseNode) InsertBefore(self, v1, insertee Node) {
}
}
// Text implements Node.Text .
// OwnerDocument implements Node.OwnerDocument.
func (n *BaseNode) OwnerDocument() *Document {
d := n.Parent()
for {
p := d.Parent()
if p == nil {
if v, ok := d.(*Document); ok {
return v
}
break
}
d = p
}
return nil
}
// Text implements Node.Text .
//
// Deprecated: Use other properties of the node to get the text value(i.e. Pragraph.Lines, Text.Value).
func (n *BaseNode) Text(source []byte) []byte {
var buf bytes.Buffer
for c := n.firstChild; c != nil; c = c.NextSibling() {
buf.Write(c.Text(source))
if sb, ok := c.(interface {
SoftLineBreak() bool
}); ok && sb.SoftLineBreak() {
buf.WriteByte('\n')
}
}
return buf.Bytes()
}
@ -345,7 +412,7 @@ func (n *BaseNode) SetAttribute(name []byte, value interface{}) {
n.attributes = append(n.attributes, Attribute{name, value})
}
// SetAttributeString implements Node.SetAttributeString
// SetAttributeString implements Node.SetAttributeString.
func (n *BaseNode) SetAttributeString(name string, value interface{}) {
n.SetAttribute(util.StringToReadOnlyBytes(name), value)
}
@ -368,12 +435,12 @@ func (n *BaseNode) AttributeString(s string) (interface{}, bool) {
return n.Attribute(util.StringToReadOnlyBytes(s))
}
// Attributes implements Node.Attributes
// Attributes implements Node.Attributes.
func (n *BaseNode) Attributes() []Attribute {
return n.attributes
}
// RemoveAttributes implements Node.RemoveAttributes
// RemoveAttributes implements Node.RemoveAttributes.
func (n *BaseNode) RemoveAttributes() {
n.attributes = nil
}
@ -430,20 +497,25 @@ type Walker func(n Node, entering bool) (WalkStatus, error)
// Walk walks a AST tree by the depth first search algorithm.
func Walk(n Node, walker Walker) error {
_, err := walkHelper(n, walker)
return err
}
func walkHelper(n Node, walker Walker) (WalkStatus, error) {
status, err := walker(n, true)
if err != nil || status == WalkStop {
return err
return status, err
}
if status != WalkSkipChildren {
for c := n.FirstChild(); c != nil; c = c.NextSibling() {
if err := Walk(c, walker); err != nil {
return err
if st, err := walkHelper(c, walker); err != nil || st == WalkStop {
return WalkStop, err
}
}
}
status, err = walker(n, false)
if err != nil || status == WalkStop {
return err
return WalkStop, err
}
return nil
return WalkContinue, nil
}

60
ast/ast_test.go Normal file
View file

@ -0,0 +1,60 @@
package ast
import (
"reflect"
"testing"
)
func TestWalk(t *testing.T) {
tests := []struct {
name string
node Node
want []NodeKind
action map[NodeKind]WalkStatus
}{
{
"visits all in depth first order",
node(NewDocument(), node(NewHeading(1), NewText()), NewLink()),
[]NodeKind{KindDocument, KindHeading, KindText, KindLink},
map[NodeKind]WalkStatus{},
},
{
"stops after heading",
node(NewDocument(), node(NewHeading(1), NewText()), NewLink()),
[]NodeKind{KindDocument, KindHeading},
map[NodeKind]WalkStatus{KindHeading: WalkStop},
},
{
"skip children",
node(NewDocument(), node(NewHeading(1), NewText()), NewLink()),
[]NodeKind{KindDocument, KindHeading, KindLink},
map[NodeKind]WalkStatus{KindHeading: WalkSkipChildren},
},
}
for _, tt := range tests {
var kinds []NodeKind
collectKinds := func(n Node, entering bool) (WalkStatus, error) {
if entering {
kinds = append(kinds, n.Kind())
}
if status, ok := tt.action[n.Kind()]; ok {
return status, nil
}
return WalkContinue, nil
}
t.Run(tt.name, func(t *testing.T) {
if err := Walk(tt.node, collectKinds); err != nil {
t.Errorf("Walk() error = %v", err)
} else if !reflect.DeepEqual(kinds, tt.want) {
t.Errorf("Walk() expected = %v, got = %v", tt.want, kinds)
}
})
}
}
func node(n Node, children ...Node) Node {
for _, c := range children {
n.AppendChild(n, c)
}
return n
}

View file

@ -7,19 +7,19 @@ import (
textm "github.com/yuin/goldmark/text"
)
// A BaseBlock struct implements the Node interface.
// A BaseBlock struct implements the Node interface partialliy.
type BaseBlock struct {
BaseNode
blankPreviousLines bool
lines *textm.Segments
}
// Type implements Node.Type
// Type implements Node.Type.
func (b *BaseBlock) Type() NodeType {
return TypeBlock
}
// IsRaw implements Node.IsRaw
// IsRaw implements Node.IsRaw.
func (b *BaseBlock) IsRaw() bool {
return false
}
@ -34,7 +34,7 @@ func (b *BaseBlock) SetBlankPreviousLines(v bool) {
b.blankPreviousLines = v
}
// Lines implements Node.Lines
// Lines implements Node.Lines.
func (b *BaseBlock) Lines() *textm.Segments {
if b.lines == nil {
b.lines = textm.NewSegments()
@ -42,7 +42,7 @@ func (b *BaseBlock) Lines() *textm.Segments {
return b.lines
}
// SetLines implements Node.SetLines
// SetLines implements Node.SetLines.
func (b *BaseBlock) SetLines(v *textm.Segments) {
b.lines = v
}
@ -50,6 +50,8 @@ func (b *BaseBlock) SetLines(v *textm.Segments) {
// A Document struct is a root node of Markdown text.
type Document struct {
BaseBlock
meta map[string]interface{}
}
// KindDocument is a NodeKind of the Document node.
@ -70,10 +72,42 @@ func (n *Document) Kind() NodeKind {
return KindDocument
}
// OwnerDocument implements Node.OwnerDocument.
func (n *Document) OwnerDocument() *Document {
return n
}
// Meta returns metadata of this document.
func (n *Document) Meta() map[string]interface{} {
if n.meta == nil {
n.meta = map[string]interface{}{}
}
return n.meta
}
// SetMeta sets given metadata to this document.
func (n *Document) SetMeta(meta map[string]interface{}) {
if n.meta == nil {
n.meta = map[string]interface{}{}
}
for k, v := range meta {
n.meta[k] = v
}
}
// AddMeta adds given metadata to this document.
func (n *Document) AddMeta(key string, value interface{}) {
if n.meta == nil {
n.meta = map[string]interface{}{}
}
n.meta[key] = value
}
// NewDocument returns a new Document node.
func NewDocument() *Document {
return &Document{
BaseBlock: BaseBlock{},
meta: nil,
}
}
@ -96,6 +130,13 @@ func (n *TextBlock) Kind() NodeKind {
return KindTextBlock
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. TextBlock.Lines).
func (n *TextBlock) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewTextBlock returns a new TextBlock node.
func NewTextBlock() *TextBlock {
return &TextBlock{
@ -121,6 +162,13 @@ func (n *Paragraph) Kind() NodeKind {
return KindParagraph
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. Paragraph.Lines).
func (n *Paragraph) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewParagraph returns a new Paragraph node.
func NewParagraph() *Paragraph {
return &Paragraph{
@ -215,6 +263,13 @@ func (n *CodeBlock) Kind() NodeKind {
return KindCodeBlock
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. CodeBlock.Lines).
func (n *CodeBlock) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewCodeBlock returns a new CodeBlock node.
func NewCodeBlock() *CodeBlock {
return &CodeBlock{
@ -270,6 +325,13 @@ func (n *FencedCodeBlock) Kind() NodeKind {
return KindFencedCodeBlock
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. FencedCodeBlock.Lines).
func (n *FencedCodeBlock) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewFencedCodeBlock return a new FencedCodeBlock node.
func NewFencedCodeBlock(info *Text) *FencedCodeBlock {
return &FencedCodeBlock{
@ -303,15 +365,15 @@ func NewBlockquote() *Blockquote {
}
}
// A List structr represents a list of Markdown text.
// A List struct represents a list of Markdown text.
type List struct {
BaseBlock
// Marker is a markar character like '-', '+', ')' and '.'.
// Marker is a marker character like '-', '+', ')' and '.'.
Marker byte
// IsTight is a true if this list is a 'tight' list.
// See https://spec.commonmark.org/0.29/#loose for details.
// See https://spec.commonmark.org/0.30/#loose for details.
IsTight bool
// Start is an initial number of this ordered list.
@ -364,7 +426,7 @@ func NewList(marker byte) *List {
type ListItem struct {
BaseBlock
// Offset is an offset potision of this item.
// Offset is an offset position of this item.
Offset int
}
@ -393,23 +455,23 @@ func NewListItem(offset int) *ListItem {
}
// HTMLBlockType represents kinds of an html blocks.
// See https://spec.commonmark.org/0.29/#html-blocks
// See https://spec.commonmark.org/0.30/#html-blocks
type HTMLBlockType int
const (
// HTMLBlockType1 represents type 1 html blocks
// HTMLBlockType1 represents type 1 html blocks.
HTMLBlockType1 HTMLBlockType = iota + 1
// HTMLBlockType2 represents type 2 html blocks
// HTMLBlockType2 represents type 2 html blocks.
HTMLBlockType2
// HTMLBlockType3 represents type 3 html blocks
// HTMLBlockType3 represents type 3 html blocks.
HTMLBlockType3
// HTMLBlockType4 represents type 4 html blocks
// HTMLBlockType4 represents type 4 html blocks.
HTMLBlockType4
// HTMLBlockType5 represents type 5 html blocks
// HTMLBlockType5 represents type 5 html blocks.
HTMLBlockType5
// HTMLBlockType6 represents type 6 html blocks
// HTMLBlockType6 represents type 6 html blocks.
HTMLBlockType6
// HTMLBlockType7 represents type 7 html blocks
// HTMLBlockType7 represents type 7 html blocks.
HTMLBlockType7
)
@ -464,6 +526,17 @@ func (n *HTMLBlock) Kind() NodeKind {
return KindHTMLBlock
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. HTMLBlock.Lines).
func (n *HTMLBlock) Text(source []byte) []byte {
ret := n.Lines().Value(source)
if n.HasClosure() {
ret = append(ret, n.ClosureLine.Value(source)...)
}
return ret
}
// NewHTMLBlock returns a new HTMLBlock node.
func NewHTMLBlock(typ HTMLBlockType) *HTMLBlock {
return &HTMLBlock{

View file

@ -8,17 +8,17 @@ import (
"github.com/yuin/goldmark/util"
)
// A BaseInline struct implements the Node interface.
// A BaseInline struct implements the Node interface partialliy.
type BaseInline struct {
BaseNode
}
// Type implements Node.Type
// Type implements Node.Type.
func (b *BaseInline) Type() NodeType {
return TypeInline
}
// IsRaw implements Node.IsRaw
// IsRaw implements Node.IsRaw.
func (b *BaseInline) IsRaw() bool {
return false
}
@ -33,12 +33,12 @@ func (b *BaseInline) SetBlankPreviousLines(v bool) {
panic("can not call with inline nodes.")
}
// Lines implements Node.Lines
// Lines implements Node.Lines.
func (b *BaseInline) Lines() *textm.Segments {
panic("can not call with inline nodes.")
}
// SetLines implements Node.SetLines
// SetLines implements Node.SetLines.
func (b *BaseInline) SetLines(v *textm.Segments) {
panic("can not call with inline nodes.")
}
@ -91,7 +91,7 @@ func (n *Text) SetSoftLineBreak(v bool) {
if v {
n.flags |= textSoftLineBreak
} else {
n.flags = n.flags &^ textHardLineBreak
n.flags = n.flags &^ textSoftLineBreak
}
}
@ -111,7 +111,7 @@ func (n *Text) SetRaw(v bool) {
}
// HardLineBreak returns true if this node ends with a hard line break.
// See https://spec.commonmark.org/0.29/#hard-line-breaks for details.
// See https://spec.commonmark.org/0.30/#hard-line-breaks for details.
func (n *Text) HardLineBreak() bool {
return n.flags&textHardLineBreak != 0
}
@ -132,7 +132,8 @@ func (n *Text) Merge(node Node, source []byte) bool {
if !ok {
return false
}
if n.Segment.Stop != t.Segment.Start || t.Segment.Padding != 0 || source[n.Segment.Stop-1] == '\n' || t.IsRaw() != n.IsRaw() {
if n.Segment.Stop != t.Segment.Start || t.Segment.Padding != 0 ||
source[n.Segment.Stop-1] == '\n' || t.IsRaw() != n.IsRaw() {
return false
}
n.Segment.Stop = t.Segment.Stop
@ -142,17 +143,25 @@ func (n *Text) Merge(node Node, source []byte) bool {
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. Text.Value).
func (n *Text) Text(source []byte) []byte {
return n.Segment.Value(source)
}
// Value returns a value of this node.
// SoftLineBreaks are not included in the returned value.
func (n *Text) Value(source []byte) []byte {
return n.Segment.Value(source)
}
// Dump implements Node.Dump.
func (n *Text) Dump(source []byte, level int) {
fs := textFlagsString(n.flags)
if len(fs) != 0 {
fs = "(" + fs + ")"
}
fmt.Printf("%sText%s: \"%s\"\n", strings.Repeat(" ", level), fs, strings.TrimRight(string(n.Text(source)), "\n"))
fmt.Printf("%sText%s: \"%s\"\n", strings.Repeat(" ", level), fs, strings.TrimRight(string(n.Value(source)), "\n"))
}
// KindText is a NodeKind of the Text node.
@ -170,7 +179,7 @@ func NewText() *Text {
}
}
// NewTextSegment returns a new Text node with the given source potision.
// NewTextSegment returns a new Text node with the given source position.
func NewTextSegment(v textm.Segment) *Text {
return &Text{
BaseInline: BaseInline{},
@ -214,7 +223,7 @@ func MergeOrReplaceTextSegment(parent Node, n Node, s textm.Segment) {
}
}
// A String struct is a textual content that has a concrete value
// A String struct is a textual content that has a concrete value.
type String struct {
BaseInline
@ -257,6 +266,8 @@ func (n *String) SetCode(v bool) {
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. String.Value).
func (n *String) Text(source []byte) []byte {
return n.Value
}
@ -305,7 +316,7 @@ func (n *CodeSpan) IsBlank(source []byte) bool {
return true
}
// Dump implements Node.Dump
// Dump implements Node.Dump.
func (n *CodeSpan) Dump(source []byte, level int) {
DumpHelper(n, source, level, nil, nil)
}
@ -467,7 +478,7 @@ type AutoLink struct {
// Inline implements Inline.Inline.
func (n *AutoLink) Inline() {}
// Dump implenets Node.Dump
// Dump implements Node.Dump.
func (n *AutoLink) Dump(source []byte, level int) {
segment := n.value.Segment
m := map[string]string{
@ -491,15 +502,22 @@ func (n *AutoLink) URL(source []byte) []byte {
ret := make([]byte, 0, len(n.Protocol)+s.Len()+3)
ret = append(ret, n.Protocol...)
ret = append(ret, ':', '/', '/')
ret = append(ret, n.value.Text(source)...)
ret = append(ret, n.value.Value(source)...)
return ret
}
return n.value.Text(source)
return n.value.Value(source)
}
// Label returns a label of this node.
func (n *AutoLink) Label(source []byte) []byte {
return n.value.Text(source)
return n.value.Value(source)
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. AutoLink.Label).
func (n *AutoLink) Text(source []byte) []byte {
return n.value.Value(source)
}
// NewAutoLink returns a new AutoLink node.
@ -540,6 +558,13 @@ func (n *RawHTML) Kind() NodeKind {
return KindRawHTML
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. RawHTML.Segments).
func (n *RawHTML) Text(source []byte) []byte {
return n.Segments.Value(source)
}
// NewRawHTML returns a new RawHTML node.
func NewRawHTML() *RawHTML {
return &RawHTML{

204
ast_test.go Normal file
View file

@ -0,0 +1,204 @@
package goldmark_test
import (
"bytes"
"testing"
. "github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/text"
)
func TestASTBlockNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
T2 string
C bool
}{
{
Name: "AtxHeading",
Source: `# l1
a
# l2`,
T1: `l1`,
T2: `l2`,
},
{
Name: "SetextHeading",
Source: `l1
l2
===============
a
l3
l4
==============`,
T1: `l1
l2`,
T2: `l3
l4`,
},
{
Name: "CodeBlock",
Source: ` l1
l2
a
l3
l4`,
T1: `l1
l2
`,
T2: `l3
l4
`,
},
{
Name: "FencedCodeBlock",
Source: "```" + `
l1
l2
` + "```" + `
a
` + "```" + `
l3
l4`,
T1: `l1
l2
`,
T2: `l3
l4
`,
},
{
Name: "Blockquote",
Source: `> l1
> l2
a
> l3
> l4`,
T1: `l1
l2`,
T2: `l3
l4`,
},
{
Name: "List",
Source: `- l1
l2
a
- l3
l4`,
T1: `l1
l2`,
T2: `l3
l4`,
C: true,
},
{
Name: "HTMLBlock",
Source: `<div>
l1
l2
</div>
a
<div>
l3
l4`,
T1: `<div>
l1
l2
</div>
`,
T2: `<div>
l3
l4`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := New()
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild()
c2 := c1.NextSibling().NextSibling()
if cs.C {
c1 = c1.FirstChild()
c2 = c2.FirstChild()
}
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch: %s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
if !bytes.Equal(c2.Text(s), []byte(cs.T2)) { // nolint: staticcheck
t.Errorf("%s(EOF) unmatch: %s", cs.Name, testutil.DiffPretty(c2.Text(s), []byte(cs.T2))) // nolint: staticcheck
}
})
}
}
func TestASTInlineNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
}{
{
Name: "CodeSpan",
Source: "`c1`",
T1: `c1`,
},
{
Name: "Emphasis",
Source: `*c1 **c2***`,
T1: `c1 c2`,
},
{
Name: "Link",
Source: `[label](url)`,
T1: `label`,
},
{
Name: "AutoLink",
Source: `<http://url>`,
T1: `http://url`,
},
{
Name: "RawHTML",
Source: `<span>c1</span>`,
T1: `<span>`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := New()
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild().FirstChild()
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch:\n%s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
})
}
}

View file

@ -2,12 +2,12 @@ package goldmark_test
import (
"encoding/json"
"io/ioutil"
"os"
"testing"
. "github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
type commonmarkSpecTestCase struct {
@ -20,7 +20,7 @@ type commonmarkSpecTestCase struct {
}
func TestSpec(t *testing.T) {
bs, err := ioutil.ReadFile("_test/spec.json")
bs, err := os.ReadFile("_test/spec.json")
if err != nil {
panic(err)
}
@ -29,12 +29,25 @@ func TestSpec(t *testing.T) {
panic(err)
}
cases := []testutil.MarkdownTestCase{}
nos := testutil.ParseCliCaseArg()
for _, c := range testCases {
cases = append(cases, testutil.MarkdownTestCase{
No: c.Example,
Markdown: c.Markdown,
Expected: c.HTML,
})
shouldAdd := len(nos) == 0
if !shouldAdd {
for _, no := range nos {
if c.Example == no {
shouldAdd = true
break
}
}
}
if shouldAdd {
cases = append(cases, testutil.MarkdownTestCase{
No: c.Example,
Markdown: c.Markdown,
Expected: c.HTML,
})
}
}
markdown := New(WithRendererOptions(
html.WithXHTML(),

View file

@ -141,3 +141,17 @@ on two lines.</p>
</dl>
//= = = = = = = = = = = = = = = = = = = = = = = =//
6: Definition lists indented with tabs
//- - - - - - - - -//
0
: ```
0
//- - - - - - - - -//
<dl>
<dt>0</dt>
<dd><pre><code> 0
</code></pre>
</dd>
</dl>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -7,16 +7,85 @@ That's some text with a footnote.[^1]
That's the second paragraph.
//- - - - - - - - -//
<p>That's some text with a footnote.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<section class="footnotes" role="doc-endnotes">
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1" role="doc-endnote">
<li id="fn:1">
<p>And that's the footnote.</p>
<p>That's the second paragraph.</p>
<p>That's the second paragraph.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</section>
</div>
//= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
[^000]:0 [^]:
//- - - - - - - - -//
//= = = = = = = = = = = = = = = = = = = = = = = =//
4
//- - - - - - - - -//
This[^3] is[^1] text with footnotes[^2].
[^1]: Footnote one
[^2]: Footnote two
[^3]: Footnote three
//- - - - - - - - -//
<p>This<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> is<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup> text with footnotes<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup>.</p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Footnote three&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Footnote one&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>Footnote two&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
//= = = = = = = = = = = = = = = = = = = = = = = =//
5
//- - - - - - - - -//
test![^1]
[^1]: footnote
//- - - - - - - - -//
<p>test!<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>footnote&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
//= = = = = = = = = = = = = = = = = = = = = = = =//
6: Multiple references to the same footnotes should have different ids
//- - - - - - - - -//
something[^fn:1]
something[^fn:1]
something[^fn:1]
[^fn:1]: footnote text
//- - - - - - - - -//
<p>something<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>something<sup id="fnref1:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<p>something<sup id="fnref2:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>footnote text&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref1:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a>&#160;<a href="#fnref2:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -112,3 +112,82 @@ a.b-c_d@a.b_
<p>a.b-c_d@a.b-</p>
<p>a.b-c_d@a.b_</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
11
//- - - - - - - - -//
https://github.com#sun,mon
//- - - - - - - - -//
<p><a href="https://github.com#sun,mon">https://github.com#sun,mon</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
12
//- - - - - - - - -//
https://github.com/sunday's
//- - - - - - - - -//
<p><a href="https://github.com/sunday's">https://github.com/sunday's</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
13
//- - - - - - - - -//
https://github.com?q=stars:>1
//- - - - - - - - -//
<p><a href="https://github.com?q=stars:%3E1">https://github.com?q=stars:&gt;1</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
14
//- - - - - - - - -//
[https://google.com](https://google.com)
//- - - - - - - - -//
<p><a href="https://google.com">https://google.com</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
15
//- - - - - - - - -//
This is a `git@github.com:vim/vim`
//- - - - - - - - -//
<p>This is a <code>git@github.com:vim/vim</code></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
16
//- - - - - - - - -//
https://nic.college
//- - - - - - - - -//
<p><a href="https://nic.college">https://nic.college</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
17
//- - - - - - - - -//
http://server.intranet.acme.com:1313
//- - - - - - - - -//
<p><a href="http://server.intranet.acme.com:1313">http://server.intranet.acme.com:1313</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
18
//- - - - - - - - -//
https://g.page/foo
//- - - - - - - - -//
<p><a href="https://g.page/foo">https://g.page/foo</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
19: Trailing punctuation (specifically, ?, !, ., ,, :, *, _, and ~) will not be considered part of the autolink
//- - - - - - - - -//
__http://test.com/~/a__
__http://test.com/~/__
__http://test.com/~__
__http://test.com/a/~__
//- - - - - - - - -//
<p><strong><a href="http://test.com/~/a">http://test.com/~/a</a></strong>
<strong><a href="http://test.com/~/">http://test.com/~/</a></strong>
<strong><a href="http://test.com/">http://test.com/</a>~</strong>
<strong><a href="http://test.com/a/">http://test.com/a/</a>~</strong></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -5,8 +5,6 @@
<p><del>Hi</del> Hello, world!</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
2
//- - - - - - - - -//
This ~~has a
@ -16,3 +14,26 @@ new paragraph~~.
<p>This ~~has a</p>
<p>new paragraph~~.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
~Hi~ Hello, world!
//- - - - - - - - -//
<p><del>Hi</del> Hello, world!</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4: Three or more tildes do not create a strikethrough
//- - - - - - - - -//
This will ~~~not~~~ strike.
//- - - - - - - - -//
<p>This will ~~~not~~~ strike.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
5: Leading three or more tildes do not create a strikethrough, create a code block
//- - - - - - - - -//
~~~Hi~~~ Hello, world!
//- - - - - - - - -//
<pre><code class="language-Hi~~~"></code></pre>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -61,7 +61,7 @@ bar | baz
</thead>
<tbody>
<tr>
<td>b <code>\|</code> az</td>
<td>b <code>|</code> az</td>
</tr>
<tr>
<td>b <strong>|</strong> im</td>
@ -188,3 +188,95 @@ bar
</thead>
</table>
//= = = = = = = = = = = = = = = = = = = = = = = =//
9
//- - - - - - - - -//
Foo|Bar
---|---
`Yoyo`|Dyne
//- - - - - - - - -//
<table>
<thead>
<tr>
<th>Foo</th>
<th>Bar</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>Yoyo</code></td>
<td>Dyne</td>
</tr>
</tbody>
</table>
//= = = = = = = = = = = = = = = = = = = = = = = =//
10
//- - - - - - - - -//
foo|bar
---|---
`\` | second column
//- - - - - - - - -//
<table>
<thead>
<tr>
<th>foo</th>
<th>bar</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>\</code></td>
<td>second column</td>
</tr>
</tbody>
</table>
//= = = = = = = = = = = = = = = = = = = = = = = =//
11: Tables can interrupt paragraph
//- - - - - - - - -//
**xxx**
| hello | hi |
| :----: | :----:|
//- - - - - - - - -//
<p><strong>xxx</strong></p>
<table>
<thead>
<tr>
<th align="center">hello</th>
<th align="center">hi</th>
</tr>
</thead>
</table>
//= = = = = = = = = = = = = = = = = = = = = = = =//
12: A delimiter can not start with more than 3 spaces
//- - - - - - - - -//
Foo
---
//- - - - - - - - -//
<p>Foo
---</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
13: A delimiter can not start with more than 3 spaces(w/ tabs)
OPTIONS: {"enableEscape": true}
//- - - - - - - - -//
- aaa
Foo
\t\t---
//- - - - - - - - -//
<ul>
<li>
<p>aaa</p>
<p>Foo
---</p>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -4,8 +4,8 @@
- [x] bar
//- - - - - - - - -//
<ul>
<li><input disabled="" type="checkbox">foo</li>
<li><input checked="" disabled="" type="checkbox">bar</li>
<li><input disabled="" type="checkbox"> foo</li>
<li><input checked="" disabled="" type="checkbox"> bar</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
@ -19,12 +19,33 @@
- [ ] bim
//- - - - - - - - -//
<ul>
<li><input checked="" disabled="" type="checkbox">foo
<li><input checked="" disabled="" type="checkbox"> foo
<ul>
<li><input disabled="" type="checkbox">bar</li>
<li><input checked="" disabled="" type="checkbox">baz</li>
<li><input disabled="" type="checkbox"> bar</li>
<li><input checked="" disabled="" type="checkbox"> baz</li>
</ul>
</li>
<li><input disabled="" type="checkbox">bim</li>
<li><input disabled="" type="checkbox"> bim</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
- test[x]=[x]
//- - - - - - - - -//
<ul>
<li>test[x]=[x]</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4
//- - - - - - - - -//
+ [x] [x]
//- - - - - - - - -//
<ul>
<li><input checked="" disabled="" type="checkbox"> [x]</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -18,3 +18,126 @@ This should "be" replaced
//- - - - - - - - -//
<p><strong>&ndash;</strong> <em>&mdash;</em> a&hellip;&laquo; b&raquo;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4
//- - - - - - - - -//
Some say '90s, others say 90's, but I can't say which is best.
//- - - - - - - - -//
<p>Some say &rsquo;90s, others say 90&rsquo;s, but I can&rsquo;t say which is best.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
5: contractions
//- - - - - - - - -//
Alice's, I'm ,Don't, You'd
I've, I'll, You're
[Cat][]'s Pajamas
Yahoo!'s
[Cat]: http://example.com
//- - - - - - - - -//
<p>Alice&rsquo;s, I&rsquo;m ,Don&rsquo;t, You&rsquo;d</p>
<p>I&rsquo;ve, I&rsquo;ll, You&rsquo;re</p>
<p><a href="http://example.com">Cat</a>&rsquo;s Pajamas</p>
<p>Yahoo!&rsquo;s</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
6: "" after digits are an inch
//- - - - - - - - -//
My height is 5'6"".
//- - - - - - - - -//
<p>My height is 5'6&quot;&quot;.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
7: quote followed by ,.?! and spaces maybe a closer
//- - - - - - - - -//
reported "issue 1 (IE-only)", "issue 2", 'issue3 (FF-only)', 'issue4'
//- - - - - - - - -//
<p>reported &ldquo;issue 1 (IE-only)&rdquo;, &ldquo;issue 2&rdquo;, &lsquo;issue3 (FF-only)&rsquo;, &lsquo;issue4&rsquo;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
8: handle inches in qoutes
//- - - - - - - - -//
"Monitor 21"" and "Monitor""
//- - - - - - - - -//
<p>&ldquo;Monitor 21&quot;&rdquo; and &ldquo;Monitor&rdquo;&quot;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
9: Closing quotation marks within italics
//- - - - - - - - -//
*"At first, things were not clear."*
//- - - - - - - - -//
<p><em>&ldquo;At first, things were not clear.&rdquo;</em></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
10: Closing quotation marks within boldfacing
//- - - - - - - - -//
**"At first, things were not clear."**
//- - - - - - - - -//
<p><strong>&ldquo;At first, things were not clear.&rdquo;</strong></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
11: Closing quotation marks within boldfacing and italics
//- - - - - - - - -//
***"At first, things were not clear."***
//- - - - - - - - -//
<p><em><strong>&ldquo;At first, things were not clear.&rdquo;</strong></em></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
12: Closing quotation marks within boldfacing and italics
//- - - - - - - - -//
***"At first, things were not clear."***
//- - - - - - - - -//
<p><em><strong>&ldquo;At first, things were not clear.&rdquo;</strong></em></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
13: Plural possessives
//- - - - - - - - -//
John's dog is named Sam. The Smiths' dog is named Rover.
//- - - - - - - - -//
<p>John&rsquo;s dog is named Sam. The Smiths&rsquo; dog is named Rover.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
14: Links within quotation marks and parenthetical phrases
//- - - - - - - - -//
This is not difficult (see "[Introduction to Hugo Templating](https://gohugo.io/templates/introduction/)").
//- - - - - - - - -//
<p>This is not difficult (see &ldquo;<a href="https://gohugo.io/templates/introduction/">Introduction to Hugo Templating</a>&rdquo;).</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
15: Quotation marks within links
//- - - - - - - - -//
Apple's early Cairo font gave us ["moof" and the "dogcow."](https://www.macworld.com/article/2926184/we-miss-you-clarus-the-dogcow.html)
//- - - - - - - - -//
<p>Apple&rsquo;s early Cairo font gave us <a href="https://www.macworld.com/article/2926184/we-miss-you-clarus-the-dogcow.html">&ldquo;moof&rdquo; and the &ldquo;dogcow.&rdquo;</a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
16: Single closing quotation marks with slang/informalities
//- - - - - - - - -//
"I'm not doin' that," Bill said with emphasis.
//- - - - - - - - -//
<p>&ldquo;I&rsquo;m not doin&rsquo; that,&rdquo; Bill said with emphasis.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
17: Closing single quotation marks in quotations-within-quotations
//- - - - - - - - -//
Janet said, "When everything is 'breaking news,' nothing is 'breaking news.'"
//- - - - - - - - -//
<p>Janet said, &ldquo;When everything is &lsquo;breaking news,&rsquo; nothing is &lsquo;breaking news.&rsquo;&rdquo;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
18: Opening single quotation marks for abbreviations
//- - - - - - - - -//
We're talking about the internet --- 'net for short. Let's rock 'n roll!
//- - - - - - - - -//
<p>We&rsquo;re talking about the internet &mdash; &rsquo;net for short. Let&rsquo;s rock &rsquo;n roll!</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
19: Quotes in alt text
//- - - - - - - - -//
![Nice & day, **isn't** it?](https://example.com/image.jpg)
//- - - - - - - - -//
<p><img src="https://example.com/image.jpg" alt="Nice &amp; day, isn&rsquo;t it?"></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//

View file

@ -2,6 +2,7 @@ package ast
import (
"fmt"
gast "github.com/yuin/goldmark/ast"
)
@ -9,13 +10,17 @@ import (
// (PHP Markdown Extra) text.
type FootnoteLink struct {
gast.BaseInline
Index int
Index int
RefCount int
RefIndex int
}
// Dump implements Node.Dump.
func (n *FootnoteLink) Dump(source []byte, level int) {
m := map[string]string{}
m["Index"] = fmt.Sprintf("%v", n.Index)
m["RefCount"] = fmt.Sprintf("%v", n.RefCount)
m["RefIndex"] = fmt.Sprintf("%v", n.RefIndex)
gast.DumpHelper(n, source, level, m, nil)
}
@ -30,7 +35,44 @@ func (n *FootnoteLink) Kind() gast.NodeKind {
// NewFootnoteLink returns a new FootnoteLink node.
func NewFootnoteLink(index int) *FootnoteLink {
return &FootnoteLink{
Index: index,
Index: index,
RefCount: 0,
RefIndex: 0,
}
}
// A FootnoteBacklink struct represents a link to a footnote of Markdown
// (PHP Markdown Extra) text.
type FootnoteBacklink struct {
gast.BaseInline
Index int
RefCount int
RefIndex int
}
// Dump implements Node.Dump.
func (n *FootnoteBacklink) Dump(source []byte, level int) {
m := map[string]string{}
m["Index"] = fmt.Sprintf("%v", n.Index)
m["RefCount"] = fmt.Sprintf("%v", n.RefCount)
m["RefIndex"] = fmt.Sprintf("%v", n.RefIndex)
gast.DumpHelper(n, source, level, m, nil)
}
// KindFootnoteBacklink is a NodeKind of the FootnoteBacklink node.
var KindFootnoteBacklink = gast.NewNodeKind("FootnoteBacklink")
// Kind implements Node.Kind.
func (n *FootnoteBacklink) Kind() gast.NodeKind {
return KindFootnoteBacklink
}
// NewFootnoteBacklink returns a new FootnoteBacklink node.
func NewFootnoteBacklink(index int) *FootnoteBacklink {
return &FootnoteBacklink{
Index: index,
RefCount: 0,
RefIndex: 0,
}
}
@ -44,7 +86,10 @@ type Footnote struct {
// Dump implements Node.Dump.
func (n *Footnote) Dump(source []byte, level int) {
gast.DumpHelper(n, source, level, nil, nil)
m := map[string]string{}
m["Index"] = fmt.Sprintf("%v", n.Index)
m["Ref"] = string(n.Ref)
gast.DumpHelper(n, source, level, m, nil)
}
// KindFootnote is a NodeKind of the Footnote node.
@ -58,7 +103,8 @@ func (n *Footnote) Kind() gast.NodeKind {
// NewFootnote returns a new Footnote node.
func NewFootnote(ref []byte) *Footnote {
return &Footnote{
Ref: ref,
Ref: ref,
Index: -1,
}
}
@ -66,11 +112,14 @@ func NewFootnote(ref []byte) *Footnote {
// (PHP Markdown Extra) text.
type FootnoteList struct {
gast.BaseBlock
Count int
}
// Dump implements Node.Dump.
func (n *FootnoteList) Dump(source []byte, level int) {
gast.DumpHelper(n, source, level, nil, nil)
m := map[string]string{}
m["Count"] = fmt.Sprintf("%v", n.Count)
gast.DumpHelper(n, source, level, m, nil)
}
// KindFootnoteList is a NodeKind of the FootnoteList node.
@ -83,5 +132,7 @@ func (n *FootnoteList) Kind() gast.NodeKind {
// NewFootnoteList returns a new FootnoteList node.
func NewFootnoteList() *FootnoteList {
return &FootnoteList{}
return &FootnoteList{
Count: 0,
}
}

View file

@ -2,8 +2,9 @@ package ast
import (
"fmt"
gast "github.com/yuin/goldmark/ast"
"strings"
gast "github.com/yuin/goldmark/ast"
)
// Alignment is a text alignment of table cells.
@ -45,7 +46,7 @@ type Table struct {
Alignments []Alignment
}
// Dump implements Node.Dump
// Dump implements Node.Dump.
func (n *Table) Dump(source []byte, level int) {
gast.DumpHelper(n, source, level, nil, func(level int) {
indent := strings.Repeat(" ", level)
@ -97,7 +98,7 @@ func (n *TableRow) Kind() gast.NodeKind {
// NewTableRow returns a new TableRow node.
func NewTableRow(alignments []Alignment) *TableRow {
return &TableRow{}
return &TableRow{Alignments: alignments}
}
// A TableHeader struct represents a table header of Markdown(GFM) text.

View file

@ -11,7 +11,7 @@ type TaskCheckBox struct {
IsChecked bool
}
// Dump impelemtns Node.Dump.
// Dump implements Node.Dump.
func (n *TaskCheckBox) Dump(source []byte, level int) {
m := map[string]string{
"Checked": fmt.Sprintf("%v", n.IsChecked),

123
extension/ast_test.go Normal file
View file

@ -0,0 +1,123 @@
package extension
import (
"bytes"
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/text"
)
func TestASTBlockNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
T2 string
C bool
}{
{
Name: "DefinitionList",
Source: `c1
: c2
c3
a
c4
: c5
c6`,
T1: `c1c2
c3`,
T2: `c4c5
c6`,
},
{
Name: "Table",
Source: `| h1 | h2 |
| -- | -- |
| c1 | c2 |
a
| h3 | h4 |
| -- | -- |
| c3 | c4 |`,
T1: `h1h2c1c2`,
T2: `h3h4c3c4`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
DefinitionList,
Table,
),
)
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild()
c2 := c1.NextSibling().NextSibling()
if cs.C {
c1 = c1.FirstChild()
c2 = c2.FirstChild()
}
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch:\n%s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
if !bytes.Equal(c2.Text(s), []byte(cs.T2)) { // nolint: staticcheck
t.Errorf("%s(EOF) unmatch: %s", cs.Name, testutil.DiffPretty(c2.Text(s), []byte(cs.T2))) // nolint: staticcheck
}
})
}
}
func TestASTInlineNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
}{
{
Name: "Strikethrough",
Source: `~c1 *c2*~`,
T1: `c1 c2`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
Strikethrough,
),
)
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild().FirstChild()
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch:\n%s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
})
}
}

72
extension/cjk.go Normal file
View file

@ -0,0 +1,72 @@
package extension
import (
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
)
// A CJKOption sets options for CJK support mostly for HTML based renderers.
type CJKOption func(*cjk)
// A EastAsianLineBreaks is a style of east asian line breaks.
type EastAsianLineBreaks int
const (
//EastAsianLineBreaksNone renders line breaks as it is.
EastAsianLineBreaksNone EastAsianLineBreaks = iota
// EastAsianLineBreaksSimple is a style where soft line breaks are ignored
// if both sides of the break are east asian wide characters.
EastAsianLineBreaksSimple
// EastAsianLineBreaksCSS3Draft is a style where soft line breaks are ignored
// even if only one side of the break is an east asian wide character.
EastAsianLineBreaksCSS3Draft
)
// WithEastAsianLineBreaks is a functional option that indicates whether softline breaks
// between east asian wide characters should be ignored.
// style defauts to [EastAsianLineBreaksSimple] .
func WithEastAsianLineBreaks(style ...EastAsianLineBreaks) CJKOption {
return func(c *cjk) {
if len(style) == 0 {
c.EastAsianLineBreaks = EastAsianLineBreaksSimple
return
}
c.EastAsianLineBreaks = style[0]
}
}
// WithEscapedSpace is a functional option that indicates that a '\' escaped half-space(0x20) should not be rendered.
func WithEscapedSpace() CJKOption {
return func(c *cjk) {
c.EscapedSpace = true
}
}
type cjk struct {
EastAsianLineBreaks EastAsianLineBreaks
EscapedSpace bool
}
// CJK is a goldmark extension that provides functionalities for CJK languages.
var CJK = NewCJK(WithEastAsianLineBreaks(), WithEscapedSpace())
// NewCJK returns a new extension with given options.
func NewCJK(opts ...CJKOption) goldmark.Extender {
e := &cjk{
EastAsianLineBreaks: EastAsianLineBreaksNone,
}
for _, opt := range opts {
opt(e)
}
return e
}
func (e *cjk) Extend(m goldmark.Markdown) {
m.Renderer().AddOptions(html.WithEastAsianLineBreaks(
html.EastAsianLineBreaks(e.EastAsianLineBreaks)))
if e.EscapedSpace {
m.Renderer().AddOptions(html.WithWriter(html.NewWriter(html.WithEscapedSpace())))
m.Parser().AddOptions(parser.WithEscapedSpace())
}
}

269
extension/cjk_test.go Normal file
View file

@ -0,0 +1,269 @@
package extension
import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestEscapedSpace(t *testing.T) {
markdown := goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
no := 1
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Without spaces around an emphasis started with east asian punctuations, it is not interpreted as an emphasis(as defined in CommonMark spec)",
Markdown: "太郎は**「こんにちわ」**と言った\nんです",
Expected: "<p>太郎は**「こんにちわ」**と言った\nんです</p>",
},
t,
)
no = 2
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "With spaces around an emphasis started with east asian punctuations, it is interpreted as an emphasis(but remains unnecessary spaces)",
Markdown: "太郎は **「こんにちわ」** と言った\nんです",
Expected: "<p>太郎は <strong>「こんにちわ」</strong> と言った\nんです</p>",
},
t,
)
// Enables EscapedSpace
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(NewCJK(WithEscapedSpace())),
)
no = 3
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "With spaces around an emphasis started with east asian punctuations,it is interpreted as an emphasis",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nんです",
Expected: "<p>太郎は<strong>「こんにちわ」</strong>と言った\nんです</p>",
},
t,
)
// ' ' triggers Linkify extension inline parser.
// Escaped spaces should not trigger the inline parser.
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewCJK(WithEscapedSpace()),
Linkify,
),
)
no = 4
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Escaped space and linkfy extension",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nんです",
Expected: "<p>太郎は<strong>「こんにちわ」</strong>と言った\nんです</p>",
},
t,
)
}
func TestEastAsianLineBreaks(t *testing.T) {
markdown := goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
no := 1
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks are rendered as a newline, so some asian users will see it as an unnecessary space",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言った\nんです</p>",
},
t,
)
// Enables EastAsianLineBreaks
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(NewCJK(WithEastAsianLineBreaks())),
)
no = 2
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between east asian wide characters are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったんです</p>",
},
t,
)
no = 3
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between western characters are rendered as a newline",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言ったa\nbんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったa\nbんです</p>",
},
t,
)
no = 4
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between a western character and an east asian wide character are rendered as a newline",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言ったa\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったa\nんです</p>",
},
t,
)
no = 5
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are rendered as a newline",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nbんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言った\nbんです</p>",
},
t,
)
// WithHardWraps take precedence over WithEastAsianLineBreaks
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithHardWraps(),
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(NewCJK(WithEastAsianLineBreaks())),
)
no = 6
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "WithHardWraps take precedence over WithEastAsianLineBreaks",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言った<br />\nんです</p>",
},
t,
)
// Tests with EastAsianLineBreaksStyleSimple
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewCJK(WithEastAsianLineBreaks()),
Linkify,
),
)
no = 7
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "WithEastAsianLineBreaks and linkfy extension",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\r\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったんです</p>",
},
t,
)
no = 8
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between east asian wide characters or punctuations are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と、\r\n言った\r\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と、言ったんです</p>",
},
t,
)
no = 9
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are ignored",
Markdown: "私はプログラマーです。\n東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。",
Expected: "<p>私はプログラマーです。東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。</p>",
},
t,
)
// Tests with EastAsianLineBreaksCSS3Draft
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewCJK(WithEastAsianLineBreaks(EastAsianLineBreaksCSS3Draft)),
),
)
no = 10
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between a western character and an east asian wide character are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言ったa\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったaんです</p>",
},
t,
)
no = 11
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nbんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったbんです</p>",
},
t,
)
no = 12
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are ignored",
Markdown: "私はプログラマーです。\n東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。",
Expected: "<p>私はプログラマーです。東京の会社に勤めています。GoでWebアプリケーションを開発しています。</p>",
},
t,
)
}

View file

@ -113,7 +113,8 @@ func (b *definitionDescriptionParser) Trigger() []byte {
return []byte{':'}
}
func (b *definitionDescriptionParser) Open(parent gast.Node, reader text.Reader, pc parser.Context) (gast.Node, parser.State) {
func (b *definitionDescriptionParser) Open(
parent gast.Node, reader text.Reader, pc parser.Context) (gast.Node, parser.State) {
line, _ := reader.PeekLine()
pos := pc.BlockOffset()
indent := pc.BlockIndent()
@ -138,7 +139,7 @@ func (b *definitionDescriptionParser) Open(parent gast.Node, reader text.Reader,
para.Parent().RemoveChild(para.Parent(), para)
}
cpos, padding := util.IndentPosition(line[pos+1:], pos+1, list.Offset-pos-1)
reader.AdvanceAndSetPadding(cpos, padding)
reader.AdvanceAndSetPadding(cpos+1, padding)
return ast.NewDefinitionDescription(), parser.HasChildren
}
@ -196,31 +197,59 @@ func (r *DefinitionListHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFunc
reg.Register(ast.KindDefinitionDescription, r.renderDefinitionDescription)
}
func (r *DefinitionListHTMLRenderer) renderDefinitionList(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
// DefinitionListAttributeFilter defines attribute names which dl elements can have.
var DefinitionListAttributeFilter = html.GlobalAttributeFilter
func (r *DefinitionListHTMLRenderer) renderDefinitionList(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<dl>\n")
if n.Attributes() != nil {
_, _ = w.WriteString("<dl")
html.RenderAttributes(w, n, DefinitionListAttributeFilter)
_, _ = w.WriteString(">\n")
} else {
_, _ = w.WriteString("<dl>\n")
}
} else {
_, _ = w.WriteString("</dl>\n")
}
return gast.WalkContinue, nil
}
func (r *DefinitionListHTMLRenderer) renderDefinitionTerm(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
// DefinitionTermAttributeFilter defines attribute names which dd elements can have.
var DefinitionTermAttributeFilter = html.GlobalAttributeFilter
func (r *DefinitionListHTMLRenderer) renderDefinitionTerm(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<dt>")
if n.Attributes() != nil {
_, _ = w.WriteString("<dt")
html.RenderAttributes(w, n, DefinitionTermAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("<dt>")
}
} else {
_, _ = w.WriteString("</dt>\n")
}
return gast.WalkContinue, nil
}
func (r *DefinitionListHTMLRenderer) renderDefinitionDescription(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
// DefinitionDescriptionAttributeFilter defines attribute names which dd elements can have.
var DefinitionDescriptionAttributeFilter = html.GlobalAttributeFilter
func (r *DefinitionListHTMLRenderer) renderDefinitionDescription(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
n := node.(*ast.DefinitionDescription)
_, _ = w.WriteString("<dd")
if n.Attributes() != nil {
html.RenderAttributes(w, n, DefinitionDescriptionAttributeFilter)
}
if n.IsTight {
_, _ = w.WriteString("<dd>")
_, _ = w.WriteString(">")
} else {
_, _ = w.WriteString("<dd>\n")
_, _ = w.WriteString(">\n")
}
} else {
_, _ = w.WriteString("</dd>\n")

View file

@ -4,8 +4,8 @@ import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestDefinitionList(t *testing.T) {
@ -17,5 +17,5 @@ func TestDefinitionList(t *testing.T) {
DefinitionList,
),
)
testutil.DoTestCaseFile(markdown, "_test/definition_list.txt", t)
testutil.DoTestCaseFile(markdown, "_test/definition_list.txt", t, testutil.ParseCliCaseArg()...)
}

View file

@ -2,6 +2,9 @@ package extension
import (
"bytes"
"fmt"
"strconv"
"github.com/yuin/goldmark"
gast "github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/extension/ast"
@ -10,10 +13,10 @@ import (
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"strconv"
)
var footnoteListKey = parser.NewContextKey()
var footnoteLinkListKey = parser.NewContextKey()
type footnoteBlockParser struct {
}
@ -41,30 +44,30 @@ func (b *footnoteBlockParser) Open(parent gast.Node, reader text.Reader, pc pars
return nil, parser.NoChildren
}
open := pos + 1
closes := 0
closure := util.FindClosure(line[pos+1:], '[', ']', false, false)
var closes int
closure := util.FindClosure(line[pos+1:], '[', ']', false, false) //nolint:staticcheck
closes = pos + 1 + closure
next := closes + 1
if closure > -1 {
closes = pos + 1 + closure
next := closes + 1
if next >= len(line) || line[next] != ':' {
return nil, parser.NoChildren
}
} else {
return nil, parser.NoChildren
}
label := reader.Value(text.NewSegment(segment.Start+open, segment.Start+closes))
padding := segment.Padding
label := reader.Value(text.NewSegment(segment.Start+open-padding, segment.Start+closes-padding))
if util.IsBlank(label) {
return nil, parser.NoChildren
}
item := ast.NewFootnote(label)
pos = pos + 2 + closes - open + 2
pos = next + 1 - padding
if pos >= len(line) {
reader.Advance(pos)
return item, parser.NoChildren
}
childpos, padding := util.IndentPosition(line[pos:], pos, 1)
reader.AdvanceAndSetPadding(pos+childpos, padding)
reader.AdvanceAndSetPadding(pos, padding)
return item, parser.HasChildren
}
@ -91,9 +94,6 @@ func (b *footnoteBlockParser) Close(node gast.Node, reader text.Reader, pc parse
node.Parent().InsertBefore(node.Parent(), node, list)
}
node.Parent().RemoveChild(node.Parent(), node)
n := node.(*ast.Footnote)
index := list.ChildCount() + 1
n.Index = index
list.AppendChild(list, node)
}
@ -117,12 +117,17 @@ func NewFootnoteParser() parser.InlineParser {
}
func (s *footnoteParser) Trigger() []byte {
return []byte{'['}
// footnote syntax probably conflict with the image syntax.
// So we need trigger this parser with '!'.
return []byte{'!', '['}
}
func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Context) gast.Node {
line, segment := block.PeekLine()
pos := 1
if len(line) > 0 && line[0] == '!' {
pos++
}
if pos >= len(line) || line[pos] != '^' {
return nil
}
@ -131,7 +136,7 @@ func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Co
return nil
}
open := pos
closure := util.FindClosure(line[pos:], '[', ']', false, false)
closure := util.FindClosure(line[pos:], '[', ']', false, false) //nolint:staticcheck
if closure < 0 {
return nil
}
@ -140,7 +145,7 @@ func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Co
block.Advance(closes + 1)
var list *ast.FootnoteList
if tlist := pc.Root().Get(footnoteListKey); tlist != nil {
if tlist := pc.Get(footnoteListKey); tlist != nil {
list = tlist.(*ast.FootnoteList)
}
if list == nil {
@ -150,6 +155,10 @@ func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Co
for def := list.FirstChild(); def != nil; def = def.NextSibling() {
d := def.(*ast.Footnote)
if bytes.Equal(d.Ref, value) {
if d.Index < 0 {
list.Count++
d.Index = list.Count
}
index = d.Index
break
}
@ -158,7 +167,20 @@ func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Co
return nil
}
return ast.NewFootnoteLink(index)
fnlink := ast.NewFootnoteLink(index)
var fnlist []*ast.FootnoteLink
if tmp := pc.Get(footnoteLinkListKey); tmp != nil {
fnlist = tmp.([]*ast.FootnoteLink)
} else {
fnlist = []*ast.FootnoteLink{}
pc.Set(footnoteLinkListKey, fnlist)
}
pc.Set(footnoteLinkListKey, append(fnlist, fnlink))
if line[0] == '!' {
parent.AppendChild(parent, gast.NewTextSegment(text.NewSegment(segment.Start, segment.Start+1)))
}
return fnlink
}
type footnoteASTTransformer struct {
@ -174,28 +196,323 @@ func NewFootnoteASTTransformer() parser.ASTTransformer {
func (a *footnoteASTTransformer) Transform(node *gast.Document, reader text.Reader, pc parser.Context) {
var list *ast.FootnoteList
if tlist := pc.Get(footnoteListKey); tlist != nil {
list = tlist.(*ast.FootnoteList)
} else {
var fnlist []*ast.FootnoteLink
if tmp := pc.Get(footnoteListKey); tmp != nil {
list = tmp.(*ast.FootnoteList)
}
if tmp := pc.Get(footnoteLinkListKey); tmp != nil {
fnlist = tmp.([]*ast.FootnoteLink)
}
pc.Set(footnoteListKey, nil)
pc.Set(footnoteLinkListKey, nil)
if list == nil {
return
}
pc.Set(footnoteListKey, nil)
counter := map[int]int{}
if fnlist != nil {
for _, fnlink := range fnlist {
if fnlink.Index >= 0 {
counter[fnlink.Index]++
}
}
refCounter := map[int]int{}
for _, fnlink := range fnlist {
fnlink.RefCount = counter[fnlink.Index]
if _, ok := refCounter[fnlink.Index]; !ok {
refCounter[fnlink.Index] = 0
}
fnlink.RefIndex = refCounter[fnlink.Index]
refCounter[fnlink.Index]++
}
}
for footnote := list.FirstChild(); footnote != nil; {
var container gast.Node = footnote
next := footnote.NextSibling()
if fc := container.LastChild(); fc != nil && gast.IsParagraph(fc) {
container = fc
}
fn := footnote.(*ast.Footnote)
index := fn.Index
if index < 0 {
list.RemoveChild(list, footnote)
} else {
refCount := counter[index]
backLink := ast.NewFootnoteBacklink(index)
backLink.RefCount = refCount
backLink.RefIndex = 0
container.AppendChild(container, backLink)
if refCount > 1 {
for i := 1; i < refCount; i++ {
backLink := ast.NewFootnoteBacklink(index)
backLink.RefCount = refCount
backLink.RefIndex = i
container.AppendChild(container, backLink)
}
}
}
footnote = next
}
list.SortChildren(func(n1, n2 gast.Node) int {
if n1.(*ast.Footnote).Index < n2.(*ast.Footnote).Index {
return -1
}
return 1
})
if list.Count <= 0 {
list.Parent().RemoveChild(list.Parent(), list)
return
}
node.AppendChild(node, list)
}
// FootnoteConfig holds configuration values for the footnote extension.
//
// Link* and Backlink* configurations have some variables:
// Occurrences of “^^” in the string will be replaced by the
// corresponding footnote number in the HTML output.
// Occurrences of “%%” will be replaced by a number for the
// reference (footnotes can have multiple references).
type FootnoteConfig struct {
html.Config
// IDPrefix is a prefix for the id attributes generated by footnotes.
IDPrefix []byte
// IDPrefix is a function that determines the id attribute for given Node.
IDPrefixFunction func(gast.Node) []byte
// LinkTitle is an optional title attribute for footnote links.
LinkTitle []byte
// BacklinkTitle is an optional title attribute for footnote backlinks.
BacklinkTitle []byte
// LinkClass is a class for footnote links.
LinkClass []byte
// BacklinkClass is a class for footnote backlinks.
BacklinkClass []byte
// BacklinkHTML is an HTML content for footnote backlinks.
BacklinkHTML []byte
}
// FootnoteOption interface is a functional option interface for the extension.
type FootnoteOption interface {
renderer.Option
// SetFootnoteOption sets given option to the extension.
SetFootnoteOption(*FootnoteConfig)
}
// NewFootnoteConfig returns a new Config with defaults.
func NewFootnoteConfig() FootnoteConfig {
return FootnoteConfig{
Config: html.NewConfig(),
LinkTitle: []byte(""),
BacklinkTitle: []byte(""),
LinkClass: []byte("footnote-ref"),
BacklinkClass: []byte("footnote-backref"),
BacklinkHTML: []byte("&#x21a9;&#xfe0e;"),
}
}
// SetOption implements renderer.SetOptioner.
func (c *FootnoteConfig) SetOption(name renderer.OptionName, value interface{}) {
switch name {
case optFootnoteIDPrefixFunction:
c.IDPrefixFunction = value.(func(gast.Node) []byte)
case optFootnoteIDPrefix:
c.IDPrefix = value.([]byte)
case optFootnoteLinkTitle:
c.LinkTitle = value.([]byte)
case optFootnoteBacklinkTitle:
c.BacklinkTitle = value.([]byte)
case optFootnoteLinkClass:
c.LinkClass = value.([]byte)
case optFootnoteBacklinkClass:
c.BacklinkClass = value.([]byte)
case optFootnoteBacklinkHTML:
c.BacklinkHTML = value.([]byte)
default:
c.Config.SetOption(name, value)
}
}
type withFootnoteHTMLOptions struct {
value []html.Option
}
func (o *withFootnoteHTMLOptions) SetConfig(c *renderer.Config) {
if o.value != nil {
for _, v := range o.value {
v.(renderer.Option).SetConfig(c)
}
}
}
func (o *withFootnoteHTMLOptions) SetFootnoteOption(c *FootnoteConfig) {
if o.value != nil {
for _, v := range o.value {
v.SetHTMLOption(&c.Config)
}
}
}
// WithFootnoteHTMLOptions is functional option that wraps goldmark HTMLRenderer options.
func WithFootnoteHTMLOptions(opts ...html.Option) FootnoteOption {
return &withFootnoteHTMLOptions{opts}
}
const optFootnoteIDPrefix renderer.OptionName = "FootnoteIDPrefix"
type withFootnoteIDPrefix struct {
value []byte
}
func (o *withFootnoteIDPrefix) SetConfig(c *renderer.Config) {
c.Options[optFootnoteIDPrefix] = o.value
}
func (o *withFootnoteIDPrefix) SetFootnoteOption(c *FootnoteConfig) {
c.IDPrefix = o.value
}
// WithFootnoteIDPrefix is a functional option that is a prefix for the id attributes generated by footnotes.
func WithFootnoteIDPrefix[T []byte | string](a T) FootnoteOption {
return &withFootnoteIDPrefix{[]byte(a)}
}
const optFootnoteIDPrefixFunction renderer.OptionName = "FootnoteIDPrefixFunction"
type withFootnoteIDPrefixFunction struct {
value func(gast.Node) []byte
}
func (o *withFootnoteIDPrefixFunction) SetConfig(c *renderer.Config) {
c.Options[optFootnoteIDPrefixFunction] = o.value
}
func (o *withFootnoteIDPrefixFunction) SetFootnoteOption(c *FootnoteConfig) {
c.IDPrefixFunction = o.value
}
// WithFootnoteIDPrefixFunction is a functional option that is a prefix for the id attributes generated by footnotes.
func WithFootnoteIDPrefixFunction(a func(gast.Node) []byte) FootnoteOption {
return &withFootnoteIDPrefixFunction{a}
}
const optFootnoteLinkTitle renderer.OptionName = "FootnoteLinkTitle"
type withFootnoteLinkTitle struct {
value []byte
}
func (o *withFootnoteLinkTitle) SetConfig(c *renderer.Config) {
c.Options[optFootnoteLinkTitle] = o.value
}
func (o *withFootnoteLinkTitle) SetFootnoteOption(c *FootnoteConfig) {
c.LinkTitle = o.value
}
// WithFootnoteLinkTitle is a functional option that is an optional title attribute for footnote links.
func WithFootnoteLinkTitle[T []byte | string](a T) FootnoteOption {
return &withFootnoteLinkTitle{[]byte(a)}
}
const optFootnoteBacklinkTitle renderer.OptionName = "FootnoteBacklinkTitle"
type withFootnoteBacklinkTitle struct {
value []byte
}
func (o *withFootnoteBacklinkTitle) SetConfig(c *renderer.Config) {
c.Options[optFootnoteBacklinkTitle] = o.value
}
func (o *withFootnoteBacklinkTitle) SetFootnoteOption(c *FootnoteConfig) {
c.BacklinkTitle = o.value
}
// WithFootnoteBacklinkTitle is a functional option that is an optional title attribute for footnote backlinks.
func WithFootnoteBacklinkTitle[T []byte | string](a T) FootnoteOption {
return &withFootnoteBacklinkTitle{[]byte(a)}
}
const optFootnoteLinkClass renderer.OptionName = "FootnoteLinkClass"
type withFootnoteLinkClass struct {
value []byte
}
func (o *withFootnoteLinkClass) SetConfig(c *renderer.Config) {
c.Options[optFootnoteLinkClass] = o.value
}
func (o *withFootnoteLinkClass) SetFootnoteOption(c *FootnoteConfig) {
c.LinkClass = o.value
}
// WithFootnoteLinkClass is a functional option that is a class for footnote links.
func WithFootnoteLinkClass[T []byte | string](a T) FootnoteOption {
return &withFootnoteLinkClass{[]byte(a)}
}
const optFootnoteBacklinkClass renderer.OptionName = "FootnoteBacklinkClass"
type withFootnoteBacklinkClass struct {
value []byte
}
func (o *withFootnoteBacklinkClass) SetConfig(c *renderer.Config) {
c.Options[optFootnoteBacklinkClass] = o.value
}
func (o *withFootnoteBacklinkClass) SetFootnoteOption(c *FootnoteConfig) {
c.BacklinkClass = o.value
}
// WithFootnoteBacklinkClass is a functional option that is a class for footnote backlinks.
func WithFootnoteBacklinkClass[T []byte | string](a T) FootnoteOption {
return &withFootnoteBacklinkClass{[]byte(a)}
}
const optFootnoteBacklinkHTML renderer.OptionName = "FootnoteBacklinkHTML"
type withFootnoteBacklinkHTML struct {
value []byte
}
func (o *withFootnoteBacklinkHTML) SetConfig(c *renderer.Config) {
c.Options[optFootnoteBacklinkHTML] = o.value
}
func (o *withFootnoteBacklinkHTML) SetFootnoteOption(c *FootnoteConfig) {
c.BacklinkHTML = o.value
}
// WithFootnoteBacklinkHTML is an HTML content for footnote backlinks.
func WithFootnoteBacklinkHTML[T []byte | string](a T) FootnoteOption {
return &withFootnoteBacklinkHTML{[]byte(a)}
}
// FootnoteHTMLRenderer is a renderer.NodeRenderer implementation that
// renders FootnoteLink nodes.
type FootnoteHTMLRenderer struct {
html.Config
FootnoteConfig
}
// NewFootnoteHTMLRenderer returns a new FootnoteHTMLRenderer.
func NewFootnoteHTMLRenderer(opts ...html.Option) renderer.NodeRenderer {
func NewFootnoteHTMLRenderer(opts ...FootnoteOption) renderer.NodeRenderer {
r := &FootnoteHTMLRenderer{
Config: html.NewConfig(),
FootnoteConfig: NewFootnoteConfig(),
}
for _, opt := range opts {
opt.SetHTMLOption(&r.Config)
opt.SetFootnoteOption(&r.FootnoteConfig)
}
return r
}
@ -203,48 +520,97 @@ func NewFootnoteHTMLRenderer(opts ...html.Option) renderer.NodeRenderer {
// RegisterFuncs implements renderer.NodeRenderer.RegisterFuncs.
func (r *FootnoteHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncRegisterer) {
reg.Register(ast.KindFootnoteLink, r.renderFootnoteLink)
reg.Register(ast.KindFootnoteBacklink, r.renderFootnoteBacklink)
reg.Register(ast.KindFootnote, r.renderFootnote)
reg.Register(ast.KindFootnoteList, r.renderFootnoteList)
}
func (r *FootnoteHTMLRenderer) renderFootnoteLink(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
func (r *FootnoteHTMLRenderer) renderFootnoteLink(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
n := node.(*ast.FootnoteLink)
is := strconv.Itoa(n.Index)
_, _ = w.WriteString(`<sup id="fnref:`)
_, _ = w.WriteString(`<sup id="`)
_, _ = w.Write(r.idPrefix(node))
_, _ = w.WriteString(`fnref`)
if n.RefIndex > 0 {
_, _ = w.WriteString(fmt.Sprintf("%v", n.RefIndex))
}
_ = w.WriteByte(':')
_, _ = w.WriteString(is)
_, _ = w.WriteString(`"><a href="#fn:`)
_, _ = w.WriteString(`"><a href="#`)
_, _ = w.Write(r.idPrefix(node))
_, _ = w.WriteString(`fn:`)
_, _ = w.WriteString(is)
_, _ = w.WriteString(`" class="footnote-ref" role="doc-noteref">`)
_, _ = w.WriteString(`" class="`)
_, _ = w.Write(applyFootnoteTemplate(r.FootnoteConfig.LinkClass,
n.Index, n.RefCount))
if len(r.FootnoteConfig.LinkTitle) > 0 {
_, _ = w.WriteString(`" title="`)
_, _ = w.Write(util.EscapeHTML(applyFootnoteTemplate(r.FootnoteConfig.LinkTitle, n.Index, n.RefCount)))
}
_, _ = w.WriteString(`" role="doc-noteref">`)
_, _ = w.WriteString(is)
_, _ = w.WriteString(`</a></sup>`)
}
return gast.WalkContinue, nil
}
func (r *FootnoteHTMLRenderer) renderFootnote(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
func (r *FootnoteHTMLRenderer) renderFootnoteBacklink(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
n := node.(*ast.FootnoteBacklink)
is := strconv.Itoa(n.Index)
_, _ = w.WriteString(`&#160;<a href="#`)
_, _ = w.Write(r.idPrefix(node))
_, _ = w.WriteString(`fnref`)
if n.RefIndex > 0 {
_, _ = w.WriteString(fmt.Sprintf("%v", n.RefIndex))
}
_ = w.WriteByte(':')
_, _ = w.WriteString(is)
_, _ = w.WriteString(`" class="`)
_, _ = w.Write(applyFootnoteTemplate(r.FootnoteConfig.BacklinkClass, n.Index, n.RefCount))
if len(r.FootnoteConfig.BacklinkTitle) > 0 {
_, _ = w.WriteString(`" title="`)
_, _ = w.Write(util.EscapeHTML(applyFootnoteTemplate(r.FootnoteConfig.BacklinkTitle, n.Index, n.RefCount)))
}
_, _ = w.WriteString(`" role="doc-backlink">`)
_, _ = w.Write(applyFootnoteTemplate(r.FootnoteConfig.BacklinkHTML, n.Index, n.RefCount))
_, _ = w.WriteString(`</a>`)
}
return gast.WalkContinue, nil
}
func (r *FootnoteHTMLRenderer) renderFootnote(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
n := node.(*ast.Footnote)
is := strconv.Itoa(n.Index)
if entering {
_, _ = w.WriteString(`<li id="fn:`)
_, _ = w.WriteString(`<li id="`)
_, _ = w.Write(r.idPrefix(node))
_, _ = w.WriteString(`fn:`)
_, _ = w.WriteString(is)
_, _ = w.WriteString(`" role="doc-endnote">`)
_, _ = w.WriteString("\n")
_, _ = w.WriteString(`"`)
if node.Attributes() != nil {
html.RenderAttributes(w, node, html.ListItemAttributeFilter)
}
_, _ = w.WriteString(">\n")
} else {
_, _ = w.WriteString("</li>\n")
}
return gast.WalkContinue, nil
}
func (r *FootnoteHTMLRenderer) renderFootnoteList(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
tag := "section"
if r.Config.XHTML {
tag = "div"
}
func (r *FootnoteHTMLRenderer) renderFootnoteList(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<")
_, _ = w.WriteString(tag)
_, _ = w.WriteString(` class="footnotes" role="doc-endnotes">`)
_, _ = w.WriteString(`<div class="footnotes" role="doc-endnotes"`)
if node.Attributes() != nil {
html.RenderAttributes(w, node, html.GlobalAttributeFilter)
}
_ = w.WriteByte('>')
if r.Config.XHTML {
_, _ = w.WriteString("\n<hr />\n")
} else {
@ -253,18 +619,59 @@ func (r *FootnoteHTMLRenderer) renderFootnoteList(w util.BufWriter, source []byt
_, _ = w.WriteString("<ol>\n")
} else {
_, _ = w.WriteString("</ol>\n")
_, _ = w.WriteString("</")
_, _ = w.WriteString(tag)
_, _ = w.WriteString(">\n")
_, _ = w.WriteString("</div>\n")
}
return gast.WalkContinue, nil
}
func (r *FootnoteHTMLRenderer) idPrefix(node gast.Node) []byte {
if r.FootnoteConfig.IDPrefix != nil {
return r.FootnoteConfig.IDPrefix
}
if r.FootnoteConfig.IDPrefixFunction != nil {
return r.FootnoteConfig.IDPrefixFunction(node)
}
return []byte("")
}
func applyFootnoteTemplate(b []byte, index, refCount int) []byte {
fast := true
for i, c := range b {
if i != 0 {
if b[i-1] == '^' && c == '^' {
fast = false
break
}
if b[i-1] == '%' && c == '%' {
fast = false
break
}
}
}
if fast {
return b
}
is := []byte(strconv.Itoa(index))
rs := []byte(strconv.Itoa(refCount))
ret := bytes.Replace(b, []byte("^^"), is, -1)
return bytes.Replace(ret, []byte("%%"), rs, -1)
}
type footnote struct {
options []FootnoteOption
}
// Footnote is an extension that allow you to use PHP Markdown Extra Footnotes.
var Footnote = &footnote{}
var Footnote = &footnote{
options: []FootnoteOption{},
}
// NewFootnote returns a new extension with given options.
func NewFootnote(opts ...FootnoteOption) goldmark.Extender {
return &footnote{
options: opts,
}
}
func (e *footnote) Extend(m goldmark.Markdown) {
m.Parser().AddOptions(
@ -279,6 +686,6 @@ func (e *footnote) Extend(m goldmark.Markdown) {
),
)
m.Renderer().AddOptions(renderer.WithNodeRenderers(
util.Prioritized(NewFootnoteHTMLRenderer(), 500),
util.Prioritized(NewFootnoteHTMLRenderer(e.options...), 500),
))
}

View file

@ -4,8 +4,12 @@ import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
gast "github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
)
func TestFootnote(t *testing.T) {
@ -17,5 +21,121 @@ func TestFootnote(t *testing.T) {
Footnote,
),
)
testutil.DoTestCaseFile(markdown, "_test/footnote.txt", t)
testutil.DoTestCaseFile(markdown, "_test/footnote.txt", t, testutil.ParseCliCaseArg()...)
}
type footnoteID struct {
}
func (a *footnoteID) Transform(node *gast.Document, reader text.Reader, pc parser.Context) {
node.Meta()["footnote-prefix"] = "article12-"
}
func TestFootnoteOptions(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewFootnote(
WithFootnoteIDPrefix("article12-"),
WithFootnoteLinkClass("link-class"),
WithFootnoteBacklinkClass("backlink-class"),
WithFootnoteLinkTitle("link-title-%%-^^"),
WithFootnoteBacklinkTitle("backlink-title"),
WithFootnoteBacklinkHTML("^"),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "Footnote with options",
Markdown: `That's some text with a footnote.[^1]
Same footnote.[^1]
Another one.[^2]
[^1]: And that's the footnote.
[^2]: Another footnote.
`,
Expected: `<p>That's some text with a footnote.<sup id="article12-fnref:1"><a href="#article12-fn:1" class="link-class" title="link-title-2-1" role="doc-noteref">1</a></sup></p>
<p>Same footnote.<sup id="article12-fnref1:1"><a href="#article12-fn:1" class="link-class" title="link-title-2-1" role="doc-noteref">1</a></sup></p>
<p>Another one.<sup id="article12-fnref:2"><a href="#article12-fn:2" class="link-class" title="link-title-1-2" role="doc-noteref">2</a></sup></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="article12-fn:1">
<p>And that's the footnote.&#160;<a href="#article12-fnref:1" class="backlink-class" title="backlink-title" role="doc-backlink">^</a>&#160;<a href="#article12-fnref1:1" class="backlink-class" title="backlink-title" role="doc-backlink">^</a></p>
</li>
<li id="article12-fn:2">
<p>Another footnote.&#160;<a href="#article12-fnref:2" class="backlink-class" title="backlink-title" role="doc-backlink">^</a></p>
</li>
</ol>
</div>`,
},
t,
)
markdown = goldmark.New(
goldmark.WithParserOptions(
parser.WithASTTransformers(
util.Prioritized(&footnoteID{}, 100),
),
),
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewFootnote(
WithFootnoteIDPrefixFunction(func(n gast.Node) []byte {
v, ok := n.OwnerDocument().Meta()["footnote-prefix"]
if ok {
return util.StringToReadOnlyBytes(v.(string))
}
return nil
}),
WithFootnoteLinkClass([]byte("link-class")),
WithFootnoteBacklinkClass([]byte("backlink-class")),
WithFootnoteLinkTitle([]byte("link-title-%%-^^")),
WithFootnoteBacklinkTitle([]byte("backlink-title")),
WithFootnoteBacklinkHTML([]byte("^")),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 2,
Description: "Footnote with an id prefix function",
Markdown: `That's some text with a footnote.[^1]
Same footnote.[^1]
Another one.[^2]
[^1]: And that's the footnote.
[^2]: Another footnote.
`,
Expected: `<p>That's some text with a footnote.<sup id="article12-fnref:1"><a href="#article12-fn:1" class="link-class" title="link-title-2-1" role="doc-noteref">1</a></sup></p>
<p>Same footnote.<sup id="article12-fnref1:1"><a href="#article12-fn:1" class="link-class" title="link-title-2-1" role="doc-noteref">1</a></sup></p>
<p>Another one.<sup id="article12-fnref:2"><a href="#article12-fn:2" class="link-class" title="link-title-1-2" role="doc-noteref">2</a></sup></p>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="article12-fn:1">
<p>And that's the footnote.&#160;<a href="#article12-fnref:1" class="backlink-class" title="backlink-title" role="doc-backlink">^</a>&#160;<a href="#article12-fnref1:1" class="backlink-class" title="backlink-title" role="doc-backlink">^</a></p>
</li>
<li id="article12-fn:2">
<p>Another footnote.&#160;<a href="#article12-fnref:2" class="backlink-class" title="backlink-title" role="doc-backlink">^</a></p>
</li>
</ol>
</div>`,
},
t,
)
}

View file

@ -2,27 +2,157 @@ package extension
import (
"bytes"
"regexp"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"regexp"
)
var wwwURLRegxp = regexp.MustCompile(`^www\.[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b(?:[-a-zA-Z0-9@:%_\+.~#?&//=\(\);]*)`)
var wwwURLRegxp = regexp.MustCompile(`^www\.[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]+(?:[/#?][-a-zA-Z0-9@:%_\+.~#!?&/=\(\);,'">\^{}\[\]` + "`" + `]*)?`) //nolint:golint,lll
var urlRegexp = regexp.MustCompile(`^(?:http|https|ftp):\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=\(\);]*)`)
var urlRegexp = regexp.MustCompile(`^(?:http|https|ftp)://[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]+(?::\d+)?(?:[/#?][-a-zA-Z0-9@:%_+.~#$!?&/=\(\);,'">\^{}\[\]` + "`" + `]*)?`) //nolint:golint,lll
type linkifyParser struct {
// An LinkifyConfig struct is a data structure that holds configuration of the
// Linkify extension.
type LinkifyConfig struct {
AllowedProtocols [][]byte
URLRegexp *regexp.Regexp
WWWRegexp *regexp.Regexp
EmailRegexp *regexp.Regexp
}
var defaultLinkifyParser = &linkifyParser{}
const (
optLinkifyAllowedProtocols parser.OptionName = "LinkifyAllowedProtocols"
optLinkifyURLRegexp parser.OptionName = "LinkifyURLRegexp"
optLinkifyWWWRegexp parser.OptionName = "LinkifyWWWRegexp"
optLinkifyEmailRegexp parser.OptionName = "LinkifyEmailRegexp"
)
// SetOption implements SetOptioner.
func (c *LinkifyConfig) SetOption(name parser.OptionName, value interface{}) {
switch name {
case optLinkifyAllowedProtocols:
c.AllowedProtocols = value.([][]byte)
case optLinkifyURLRegexp:
c.URLRegexp = value.(*regexp.Regexp)
case optLinkifyWWWRegexp:
c.WWWRegexp = value.(*regexp.Regexp)
case optLinkifyEmailRegexp:
c.EmailRegexp = value.(*regexp.Regexp)
}
}
// A LinkifyOption interface sets options for the LinkifyOption.
type LinkifyOption interface {
parser.Option
SetLinkifyOption(*LinkifyConfig)
}
type withLinkifyAllowedProtocols struct {
value [][]byte
}
func (o *withLinkifyAllowedProtocols) SetParserOption(c *parser.Config) {
c.Options[optLinkifyAllowedProtocols] = o.value
}
func (o *withLinkifyAllowedProtocols) SetLinkifyOption(p *LinkifyConfig) {
p.AllowedProtocols = o.value
}
// WithLinkifyAllowedProtocols is a functional option that specify allowed
// protocols in autolinks. Each protocol must end with ':' like
// 'http:' .
func WithLinkifyAllowedProtocols[T []byte | string](value []T) LinkifyOption {
opt := &withLinkifyAllowedProtocols{}
for _, v := range value {
opt.value = append(opt.value, []byte(v))
}
return opt
}
type withLinkifyURLRegexp struct {
value *regexp.Regexp
}
func (o *withLinkifyURLRegexp) SetParserOption(c *parser.Config) {
c.Options[optLinkifyURLRegexp] = o.value
}
func (o *withLinkifyURLRegexp) SetLinkifyOption(p *LinkifyConfig) {
p.URLRegexp = o.value
}
// WithLinkifyURLRegexp is a functional option that specify
// a pattern of the URL including a protocol.
func WithLinkifyURLRegexp(value *regexp.Regexp) LinkifyOption {
return &withLinkifyURLRegexp{
value: value,
}
}
type withLinkifyWWWRegexp struct {
value *regexp.Regexp
}
func (o *withLinkifyWWWRegexp) SetParserOption(c *parser.Config) {
c.Options[optLinkifyWWWRegexp] = o.value
}
func (o *withLinkifyWWWRegexp) SetLinkifyOption(p *LinkifyConfig) {
p.WWWRegexp = o.value
}
// WithLinkifyWWWRegexp is a functional option that specify
// a pattern of the URL without a protocol.
// This pattern must start with 'www.' .
func WithLinkifyWWWRegexp(value *regexp.Regexp) LinkifyOption {
return &withLinkifyWWWRegexp{
value: value,
}
}
type withLinkifyEmailRegexp struct {
value *regexp.Regexp
}
func (o *withLinkifyEmailRegexp) SetParserOption(c *parser.Config) {
c.Options[optLinkifyEmailRegexp] = o.value
}
func (o *withLinkifyEmailRegexp) SetLinkifyOption(p *LinkifyConfig) {
p.EmailRegexp = o.value
}
// WithLinkifyEmailRegexp is a functional otpion that specify
// a pattern of the email address.
func WithLinkifyEmailRegexp(value *regexp.Regexp) LinkifyOption {
return &withLinkifyEmailRegexp{
value: value,
}
}
type linkifyParser struct {
LinkifyConfig
}
// NewLinkifyParser return a new InlineParser can parse
// text that seems like a URL.
func NewLinkifyParser() parser.InlineParser {
return defaultLinkifyParser
func NewLinkifyParser(opts ...LinkifyOption) parser.InlineParser {
p := &linkifyParser{
LinkifyConfig: LinkifyConfig{
AllowedProtocols: nil,
URLRegexp: urlRegexp,
WWWRegexp: wwwURLRegxp,
},
}
for _, o := range opts {
o.SetLinkifyOption(&p.LinkifyConfig)
}
return p
}
func (s *linkifyParser) Trigger() []byte {
@ -30,12 +160,17 @@ func (s *linkifyParser) Trigger() []byte {
return []byte{' ', '*', '_', '~', '('}
}
var protoHTTP = []byte("http:")
var protoHTTPS = []byte("https:")
var protoFTP = []byte("ftp:")
var domainWWW = []byte("www.")
var (
protoHTTP = []byte("http:")
protoHTTPS = []byte("https:")
protoFTP = []byte("ftp:")
domainWWW = []byte("www.")
)
func (s *linkifyParser) Parse(parent ast.Node, block text.Reader, pc parser.Context) ast.Node {
if pc.IsInLinkLabel() {
return nil
}
line, segment := block.PeekLine()
consumes := 0
start := segment.Start
@ -50,14 +185,26 @@ func (s *linkifyParser) Parse(parent ast.Node, block text.Reader, pc parser.Cont
var m []int
var protocol []byte
var typ ast.AutoLinkType = ast.AutoLinkURL
if bytes.HasPrefix(line, protoHTTP) || bytes.HasPrefix(line, protoHTTPS) || bytes.HasPrefix(line, protoFTP) {
m = urlRegexp.FindSubmatchIndex(line)
if s.LinkifyConfig.AllowedProtocols == nil {
if bytes.HasPrefix(line, protoHTTP) || bytes.HasPrefix(line, protoHTTPS) || bytes.HasPrefix(line, protoFTP) {
m = s.LinkifyConfig.URLRegexp.FindSubmatchIndex(line)
}
} else {
for _, prefix := range s.LinkifyConfig.AllowedProtocols {
if bytes.HasPrefix(line, prefix) {
m = s.LinkifyConfig.URLRegexp.FindSubmatchIndex(line)
break
}
}
}
if m == nil && bytes.HasPrefix(line, domainWWW) {
m = wwwURLRegxp.FindSubmatchIndex(line)
m = s.LinkifyConfig.WWWRegexp.FindSubmatchIndex(line)
protocol = []byte("http")
}
if m != nil {
if m != nil && m[0] != 0 {
m = nil
}
if m != nil && m[0] == 0 {
lastChar := line[m[1]-1]
if lastChar == '.' {
m[1]--
@ -89,8 +236,19 @@ func (s *linkifyParser) Parse(parent ast.Node, block text.Reader, pc parser.Cont
}
}
if m == nil {
if len(line) > 0 && util.IsPunct(line[0]) {
return nil
}
typ = ast.AutoLinkEmail
stop := util.FindEmailIndex(line)
stop := -1
if s.LinkifyConfig.EmailRegexp == nil {
stop = util.FindEmailIndex(line)
} else {
m := s.LinkifyConfig.EmailRegexp.FindSubmatchIndex(line)
if m != nil && m[0] == 0 {
stop = m[1]
}
}
if stop < 0 {
return nil
}
@ -117,9 +275,20 @@ func (s *linkifyParser) Parse(parent ast.Node, block text.Reader, pc parser.Cont
s := segment.WithStop(segment.Start + 1)
ast.MergeOrAppendTextSegment(parent, s)
}
consumes += m[1]
i := m[1] - 1
for ; i > 0; i-- {
c := line[i]
switch c {
case '?', '!', '.', ',', ':', '*', '_', '~':
default:
goto endfor
}
}
endfor:
i++
consumes += i
block.Advance(consumes)
n := ast.NewTextSegment(text.NewSegment(start, start+m[1]))
n := ast.NewTextSegment(text.NewSegment(start, start+i))
link := ast.NewAutoLink(typ, n)
link.Protocol = protocol
return link
@ -130,13 +299,24 @@ func (s *linkifyParser) CloseBlock(parent ast.Node, pc parser.Context) {
}
type linkify struct {
options []LinkifyOption
}
// Linkify is an extension that allow you to parse text that seems like a URL.
var Linkify = &linkify{}
func (e *linkify) Extend(m goldmark.Markdown) {
m.Parser().AddOptions(parser.WithInlineParsers(
util.Prioritized(NewLinkifyParser(), 999),
))
// NewLinkify creates a new [goldmark.Extender] that
// allow you to parse text that seems like a URL.
func NewLinkify(opts ...LinkifyOption) goldmark.Extender {
return &linkify{
options: opts,
}
}
func (e *linkify) Extend(m goldmark.Markdown) {
m.Parser().AddOptions(
parser.WithInlineParsers(
util.Prioritized(NewLinkifyParser(e.options...), 999),
),
)
}

View file

@ -1,11 +1,12 @@
package extension
import (
"regexp"
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestLinkify(t *testing.T) {
@ -17,5 +18,83 @@ func TestLinkify(t *testing.T) {
Linkify,
),
)
testutil.DoTestCaseFile(markdown, "_test/linkify.txt", t)
testutil.DoTestCaseFile(markdown, "_test/linkify.txt", t, testutil.ParseCliCaseArg()...)
}
func TestLinkifyWithAllowedProtocols(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewLinkify(
WithLinkifyAllowedProtocols([]string{
"ssh:",
}),
WithLinkifyURLRegexp(
regexp.MustCompile(`\w+://[^\s]+`),
),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Markdown: `hoge ssh://user@hoge.com. http://example.com/`,
Expected: `<p>hoge <a href="ssh://user@hoge.com">ssh://user@hoge.com</a>. http://example.com/</p>`,
},
t,
)
}
func TestLinkifyWithWWWRegexp(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewLinkify(
WithLinkifyWWWRegexp(
regexp.MustCompile(`www\.example\.com`),
),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Markdown: `www.google.com www.example.com`,
Expected: `<p>www.google.com <a href="http://www.example.com">www.example.com</a></p>`,
},
t,
)
}
func TestLinkifyWithEmailRegexp(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewLinkify(
WithLinkifyEmailRegexp(
regexp.MustCompile(`user@example\.com`),
),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Markdown: `hoge@example.com user@example.com`,
Expected: `<p>hoge@example.com <a href="mailto:user@example.com">user@example.com</a></p>`,
},
t,
)
}

2
extension/package.go Normal file
View file

@ -0,0 +1,2 @@
// Package extension is a collection of builtin extensions.
package extension

View file

@ -46,10 +46,11 @@ func (s *strikethroughParser) Trigger() []byte {
func (s *strikethroughParser) Parse(parent gast.Node, block text.Reader, pc parser.Context) gast.Node {
before := block.PrecendingCharacter()
line, segment := block.PeekLine()
node := parser.ScanDelimiter(line, before, 2, defaultStrikethroughDelimiterProcessor)
if node == nil {
node := parser.ScanDelimiter(line, before, 1, defaultStrikethroughDelimiterProcessor)
if node == nil || node.OriginalLength > 2 || before == '~' {
return nil
}
node.Segment = segment.WithStop(segment.Start + node.OriginalLength)
block.Advance(node.OriginalLength)
pc.PushDelimiter(node)
@ -82,11 +83,21 @@ func (r *StrikethroughHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncR
reg.Register(ast.KindStrikethrough, r.renderStrikethrough)
}
func (r *StrikethroughHTMLRenderer) renderStrikethrough(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
// StrikethroughAttributeFilter defines attribute names which dd elements can have.
var StrikethroughAttributeFilter = html.GlobalAttributeFilter
func (r *StrikethroughHTMLRenderer) renderStrikethrough(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
w.WriteString("<del>")
if n.Attributes() != nil {
_, _ = w.WriteString("<del")
html.RenderAttributes(w, n, StrikethroughAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("<del>")
}
} else {
w.WriteString("</del>")
_, _ = w.WriteString("</del>")
}
return gast.WalkContinue, nil
}

View file

@ -4,8 +4,8 @@ import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestStrikethrough(t *testing.T) {
@ -17,5 +17,5 @@ func TestStrikethrough(t *testing.T) {
Strikethrough,
),
)
testutil.DoTestCaseFile(markdown, "_test/strikethrough.txt", t)
testutil.DoTestCaseFile(markdown, "_test/strikethrough.txt", t, testutil.ParseCliCaseArg()...)
}

View file

@ -15,7 +15,124 @@ import (
"github.com/yuin/goldmark/util"
)
var tableDelimRegexp = regexp.MustCompile(`^[\s\-\|\:]+$`)
var escapedPipeCellListKey = parser.NewContextKey()
type escapedPipeCell struct {
Cell *ast.TableCell
Pos []int
Transformed bool
}
// TableCellAlignMethod indicates how are table cells aligned in HTML format.
type TableCellAlignMethod int
const (
// TableCellAlignDefault renders alignments by default method.
// With XHTML, alignments are rendered as an align attribute.
// With HTML5, alignments are rendered as a style attribute.
TableCellAlignDefault TableCellAlignMethod = iota
// TableCellAlignAttribute renders alignments as an align attribute.
TableCellAlignAttribute
// TableCellAlignStyle renders alignments as a style attribute.
TableCellAlignStyle
// TableCellAlignNone does not care about alignments.
// If you using classes or other styles, you can add these attributes
// in an ASTTransformer.
TableCellAlignNone
)
// TableConfig struct holds options for the extension.
type TableConfig struct {
html.Config
// TableCellAlignMethod indicates how are table celss aligned.
TableCellAlignMethod TableCellAlignMethod
}
// TableOption interface is a functional option interface for the extension.
type TableOption interface {
renderer.Option
// SetTableOption sets given option to the extension.
SetTableOption(*TableConfig)
}
// NewTableConfig returns a new Config with defaults.
func NewTableConfig() TableConfig {
return TableConfig{
Config: html.NewConfig(),
TableCellAlignMethod: TableCellAlignDefault,
}
}
// SetOption implements renderer.SetOptioner.
func (c *TableConfig) SetOption(name renderer.OptionName, value interface{}) {
switch name {
case optTableCellAlignMethod:
c.TableCellAlignMethod = value.(TableCellAlignMethod)
default:
c.Config.SetOption(name, value)
}
}
type withTableHTMLOptions struct {
value []html.Option
}
func (o *withTableHTMLOptions) SetConfig(c *renderer.Config) {
if o.value != nil {
for _, v := range o.value {
v.(renderer.Option).SetConfig(c)
}
}
}
func (o *withTableHTMLOptions) SetTableOption(c *TableConfig) {
if o.value != nil {
for _, v := range o.value {
v.SetHTMLOption(&c.Config)
}
}
}
// WithTableHTMLOptions is functional option that wraps goldmark HTMLRenderer options.
func WithTableHTMLOptions(opts ...html.Option) TableOption {
return &withTableHTMLOptions{opts}
}
const optTableCellAlignMethod renderer.OptionName = "TableTableCellAlignMethod"
type withTableCellAlignMethod struct {
value TableCellAlignMethod
}
func (o *withTableCellAlignMethod) SetConfig(c *renderer.Config) {
c.Options[optTableCellAlignMethod] = o.value
}
func (o *withTableCellAlignMethod) SetTableOption(c *TableConfig) {
c.TableCellAlignMethod = o.value
}
// WithTableCellAlignMethod is a functional option that indicates how are table cells aligned in HTML format.
func WithTableCellAlignMethod(a TableCellAlignMethod) TableOption {
return &withTableCellAlignMethod{a}
}
func isTableDelim(bs []byte) bool {
if w, _ := util.IndentWidth(bs, 0); w > 3 {
return false
}
for _, b := range bs {
if !(util.IsSpace(b) || b == '-' || b == '|' || b == ':') {
return false
}
}
return true
}
var tableDelimLeft = regexp.MustCompile(`^\s*\:\-+\s*$`)
var tableDelimRight = regexp.MustCompile(`^\s*\-+\:\s*$`)
var tableDelimCenter = regexp.MustCompile(`^\s*\:\-+\:\s*$`)
@ -27,7 +144,7 @@ type tableParagraphTransformer struct {
var defaultTableParagraphTransformer = &tableParagraphTransformer{}
// NewTableParagraphTransformer returns a new ParagraphTransformer
// that can transform pargraphs into tables.
// that can transform paragraphs into tables.
func NewTableParagraphTransformer() parser.ParagraphTransformer {
return defaultTableParagraphTransformer
}
@ -37,31 +154,41 @@ func (b *tableParagraphTransformer) Transform(node *gast.Paragraph, reader text.
if lines.Len() < 2 {
return
}
alignments := b.parseDelimiter(lines.At(1), reader)
if alignments == nil {
return
for i := 1; i < lines.Len(); i++ {
alignments := b.parseDelimiter(lines.At(i), reader)
if alignments == nil {
continue
}
header := b.parseRow(lines.At(i-1), alignments, true, reader, pc)
if header == nil || len(alignments) != header.ChildCount() {
return
}
table := ast.NewTable()
table.Alignments = alignments
table.AppendChild(table, ast.NewTableHeader(header))
for j := i + 1; j < lines.Len(); j++ {
table.AppendChild(table, b.parseRow(lines.At(j), alignments, false, reader, pc))
}
node.Lines().SetSliced(0, i-1)
node.Parent().InsertAfter(node.Parent(), node, table)
if node.Lines().Len() == 0 {
node.Parent().RemoveChild(node.Parent(), node)
} else {
last := node.Lines().At(i - 2)
last.Stop = last.Stop - 1 // trim last newline(\n)
node.Lines().Set(i-2, last)
}
}
header := b.parseRow(lines.At(0), alignments, true, reader)
if header == nil || len(alignments) != header.ChildCount() {
return
}
table := ast.NewTable()
table.Alignments = alignments
table.AppendChild(table, ast.NewTableHeader(header))
for i := 2; i < lines.Len(); i++ {
table.AppendChild(table, b.parseRow(lines.At(i), alignments, false, reader))
}
node.Parent().InsertBefore(node.Parent(), node, table)
node.Parent().RemoveChild(node.Parent(), node)
}
func (b *tableParagraphTransformer) parseRow(segment text.Segment, alignments []ast.Alignment, isHeader bool, reader text.Reader) *ast.TableRow {
func (b *tableParagraphTransformer) parseRow(segment text.Segment,
alignments []ast.Alignment, isHeader bool, reader text.Reader, pc parser.Context) *ast.TableRow {
source := reader.Source()
segment = segment.TrimLeftSpace(source)
segment = segment.TrimRightSpace(source)
line := segment.Value(source)
pos := 0
pos += util.TrimLeftSpaceLength(line)
limit := len(line)
limit -= util.TrimRightSpaceLength(line)
row := ast.NewTableRow(alignments)
if len(line) > 0 && line[pos] == '|' {
pos++
@ -79,18 +206,39 @@ func (b *tableParagraphTransformer) parseRow(segment text.Segment, alignments []
} else {
alignment = alignments[i]
}
closure := util.FindClosure(line[pos:], byte(0), '|', true, false)
if closure < 0 {
closure = len(line[pos:])
}
var escapedCell *escapedPipeCell
node := ast.NewTableCell()
segment := text.NewSegment(segment.Start+pos, segment.Start+pos+closure)
segment = segment.TrimLeftSpace(source)
segment = segment.TrimRightSpace(source)
node.Lines().Append(segment)
node.Alignment = alignment
hasBacktick := false
closure := pos
for ; closure < limit; closure++ {
if line[closure] == '`' {
hasBacktick = true
}
if line[closure] == '|' {
if closure == 0 || line[closure-1] != '\\' {
break
} else if hasBacktick {
if escapedCell == nil {
escapedCell = &escapedPipeCell{node, []int{}, false}
escapedList := pc.ComputeIfAbsent(escapedPipeCellListKey,
func() interface{} {
return []*escapedPipeCell{}
}).([]*escapedPipeCell)
escapedList = append(escapedList, escapedCell)
pc.Set(escapedPipeCellListKey, escapedList)
}
escapedCell.Pos = append(escapedCell.Pos, segment.Start+closure-1)
}
}
}
seg := text.NewSegment(segment.Start+pos, segment.Start+closure)
seg = seg.TrimLeftSpace(source)
seg = seg.TrimRightSpace(source)
node.Lines().Append(seg)
row.AppendChild(row, node)
pos += closure + 1
pos = closure + 1
}
for ; i < len(alignments); i++ {
row.AppendChild(row, ast.NewTableCell())
@ -99,8 +247,9 @@ func (b *tableParagraphTransformer) parseRow(segment text.Segment, alignments []
}
func (b *tableParagraphTransformer) parseDelimiter(segment text.Segment, reader text.Reader) []ast.Alignment {
line := segment.Value(reader.Source())
if !tableDelimRegexp.Match(line) {
if !isTableDelim(line) {
return nil
}
cols := bytes.Split(line, []byte{'|'})
@ -128,19 +277,74 @@ func (b *tableParagraphTransformer) parseDelimiter(segment text.Segment, reader
return alignments
}
type tableASTTransformer struct {
}
var defaultTableASTTransformer = &tableASTTransformer{}
// NewTableASTTransformer returns a parser.ASTTransformer for tables.
func NewTableASTTransformer() parser.ASTTransformer {
return defaultTableASTTransformer
}
func (a *tableASTTransformer) Transform(node *gast.Document, reader text.Reader, pc parser.Context) {
lst := pc.Get(escapedPipeCellListKey)
if lst == nil {
return
}
pc.Set(escapedPipeCellListKey, nil)
for _, v := range lst.([]*escapedPipeCell) {
if v.Transformed {
continue
}
_ = gast.Walk(v.Cell, func(n gast.Node, entering bool) (gast.WalkStatus, error) {
if !entering || n.Kind() != gast.KindCodeSpan {
return gast.WalkContinue, nil
}
for c := n.FirstChild(); c != nil; {
next := c.NextSibling()
if c.Kind() != gast.KindText {
c = next
continue
}
parent := c.Parent()
ts := &c.(*gast.Text).Segment
n := c
for _, v := range lst.([]*escapedPipeCell) {
for _, pos := range v.Pos {
if ts.Start <= pos && pos < ts.Stop {
segment := n.(*gast.Text).Segment
n1 := gast.NewRawTextSegment(segment.WithStop(pos))
n2 := gast.NewRawTextSegment(segment.WithStart(pos + 1))
parent.InsertAfter(parent, n, n1)
parent.InsertAfter(parent, n1, n2)
parent.RemoveChild(parent, n)
n = n2
v.Transformed = true
}
}
}
c = next
}
return gast.WalkContinue, nil
})
}
}
// TableHTMLRenderer is a renderer.NodeRenderer implementation that
// renders Table nodes.
type TableHTMLRenderer struct {
html.Config
TableConfig
}
// NewTableHTMLRenderer returns a new TableHTMLRenderer.
func NewTableHTMLRenderer(opts ...html.Option) renderer.NodeRenderer {
func NewTableHTMLRenderer(opts ...TableOption) renderer.NodeRenderer {
r := &TableHTMLRenderer{
Config: html.NewConfig(),
TableConfig: NewTableConfig(),
}
for _, opt := range opts {
opt.SetHTMLOption(&r.Config)
opt.SetTableOption(&r.TableConfig)
}
return r
}
@ -153,19 +357,51 @@ func (r *TableHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncRegistere
reg.Register(ast.KindTableCell, r.renderTableCell)
}
func (r *TableHTMLRenderer) renderTable(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
// TableAttributeFilter defines attribute names which table elements can have.
var TableAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("align"), // [Deprecated]
[]byte("bgcolor"), // [Deprecated]
[]byte("border"), // [Deprecated]
[]byte("cellpadding"), // [Deprecated]
[]byte("cellspacing"), // [Deprecated]
[]byte("frame"), // [Deprecated]
[]byte("rules"), // [Deprecated]
[]byte("summary"), // [Deprecated]
[]byte("width"), // [Deprecated]
)
func (r *TableHTMLRenderer) renderTable(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<table>\n")
_, _ = w.WriteString("<table")
if n.Attributes() != nil {
html.RenderAttributes(w, n, TableAttributeFilter)
}
_, _ = w.WriteString(">\n")
} else {
_, _ = w.WriteString("</table>\n")
}
return gast.WalkContinue, nil
}
func (r *TableHTMLRenderer) renderTableHeader(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
// TableHeaderAttributeFilter defines attribute names which <thead> elements can have.
var TableHeaderAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("align"), // [Deprecated since HTML4] [Obsolete since HTML5]
[]byte("bgcolor"), // [Not Standardized]
[]byte("char"), // [Deprecated since HTML4] [Obsolete since HTML5]
[]byte("charoff"), // [Deprecated since HTML4] [Obsolete since HTML5]
[]byte("valign"), // [Deprecated since HTML4] [Obsolete since HTML5]
)
func (r *TableHTMLRenderer) renderTableHeader(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<thead>\n")
_, _ = w.WriteString("<tr>\n")
_, _ = w.WriteString("<thead")
if n.Attributes() != nil {
html.RenderAttributes(w, n, TableHeaderAttributeFilter)
}
_, _ = w.WriteString(">\n")
_, _ = w.WriteString("<tr>\n") // Header <tr> has no separate handle
} else {
_, _ = w.WriteString("</tr>\n")
_, _ = w.WriteString("</thead>\n")
@ -176,9 +412,23 @@ func (r *TableHTMLRenderer) renderTableHeader(w util.BufWriter, source []byte, n
return gast.WalkContinue, nil
}
func (r *TableHTMLRenderer) renderTableRow(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
// TableRowAttributeFilter defines attribute names which <tr> elements can have.
var TableRowAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("align"), // [Obsolete since HTML5]
[]byte("bgcolor"), // [Obsolete since HTML5]
[]byte("char"), // [Obsolete since HTML5]
[]byte("charoff"), // [Obsolete since HTML5]
[]byte("valign"), // [Obsolete since HTML5]
)
func (r *TableHTMLRenderer) renderTableRow(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<tr>\n")
_, _ = w.WriteString("<tr")
if n.Attributes() != nil {
html.RenderAttributes(w, n, TableRowAttributeFilter)
}
_, _ = w.WriteString(">\n")
} else {
_, _ = w.WriteString("</tr>\n")
if n.Parent().LastChild() == n {
@ -188,35 +438,127 @@ func (r *TableHTMLRenderer) renderTableRow(w util.BufWriter, source []byte, n ga
return gast.WalkContinue, nil
}
func (r *TableHTMLRenderer) renderTableCell(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
// TableThCellAttributeFilter defines attribute names which table <th> cells can have.
var TableThCellAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("abbr"), // [OK] Contains a short abbreviated description of the cell's content [NOT OK in <td>]
[]byte("align"), // [Obsolete since HTML5]
[]byte("axis"), // [Obsolete since HTML5]
[]byte("bgcolor"), // [Not Standardized]
[]byte("char"), // [Obsolete since HTML5]
[]byte("charoff"), // [Obsolete since HTML5]
[]byte("colspan"), // [OK] Number of columns that the cell is to span
[]byte("headers"), // [OK] This attribute contains a list of space-separated
// strings, each corresponding to the id attribute of the <th> elements that apply to this element
[]byte("height"), // [Deprecated since HTML4] [Obsolete since HTML5]
[]byte("rowspan"), // [OK] Number of rows that the cell is to span
[]byte("scope"), // [OK] This enumerated attribute defines the cells that
// the header (defined in the <th>) element relates to [NOT OK in <td>]
[]byte("valign"), // [Obsolete since HTML5]
[]byte("width"), // [Deprecated since HTML4] [Obsolete since HTML5]
)
// TableTdCellAttributeFilter defines attribute names which table <td> cells can have.
var TableTdCellAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("abbr"), // [Obsolete since HTML5] [OK in <th>]
[]byte("align"), // [Obsolete since HTML5]
[]byte("axis"), // [Obsolete since HTML5]
[]byte("bgcolor"), // [Not Standardized]
[]byte("char"), // [Obsolete since HTML5]
[]byte("charoff"), // [Obsolete since HTML5]
[]byte("colspan"), // [OK] Number of columns that the cell is to span
[]byte("headers"), // [OK] This attribute contains a list of space-separated
// strings, each corresponding to the id attribute of the <th> elements that apply to this element
[]byte("height"), // [Deprecated since HTML4] [Obsolete since HTML5]
[]byte("rowspan"), // [OK] Number of rows that the cell is to span
[]byte("scope"), // [Obsolete since HTML5] [OK in <th>]
[]byte("valign"), // [Obsolete since HTML5]
[]byte("width"), // [Deprecated since HTML4] [Obsolete since HTML5]
)
func (r *TableHTMLRenderer) renderTableCell(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
n := node.(*ast.TableCell)
tag := "td"
if n.Parent().Kind() == ast.KindTableHeader {
tag = "th"
}
if entering {
align := ""
_, _ = fmt.Fprintf(w, "<%s", tag)
if n.Alignment != ast.AlignNone {
align = fmt.Sprintf(` align="%s"`, n.Alignment.String())
amethod := r.TableConfig.TableCellAlignMethod
if amethod == TableCellAlignDefault {
if r.Config.XHTML {
amethod = TableCellAlignAttribute
} else {
amethod = TableCellAlignStyle
}
}
switch amethod {
case TableCellAlignAttribute:
if _, ok := n.AttributeString("align"); !ok { // Skip align render if overridden
_, _ = fmt.Fprintf(w, ` align="%s"`, n.Alignment.String())
}
case TableCellAlignStyle:
v, ok := n.AttributeString("style")
var cob util.CopyOnWriteBuffer
if ok {
cob = util.NewCopyOnWriteBuffer(v.([]byte))
cob.AppendByte(';')
}
style := fmt.Sprintf("text-align:%s", n.Alignment.String())
cob.AppendString(style)
n.SetAttributeString("style", cob.Bytes())
}
}
fmt.Fprintf(w, "<%s%s>", tag, align)
if n.Attributes() != nil {
if tag == "td" {
html.RenderAttributes(w, n, TableTdCellAttributeFilter) // <td>
} else {
html.RenderAttributes(w, n, TableThCellAttributeFilter) // <th>
}
}
_ = w.WriteByte('>')
} else {
fmt.Fprintf(w, "</%s>\n", tag)
_, _ = fmt.Fprintf(w, "</%s>\n", tag)
}
return gast.WalkContinue, nil
}
type table struct {
options []TableOption
}
// Table is an extension that allow you to use GFM tables .
var Table = &table{}
var Table = &table{
options: []TableOption{},
}
// NewTable returns a new extension with given options.
func NewTable(opts ...TableOption) goldmark.Extender {
return &table{
options: opts,
}
}
func (e *table) Extend(m goldmark.Markdown) {
m.Parser().AddOptions(parser.WithParagraphTransformers(
util.Prioritized(NewTableParagraphTransformer(), 200),
))
m.Parser().AddOptions(
parser.WithParagraphTransformers(
util.Prioritized(NewTableParagraphTransformer(), 200),
),
parser.WithASTTransformers(
util.Prioritized(defaultTableASTTransformer, 0),
),
)
m.Renderer().AddOptions(renderer.WithNodeRenderers(
util.Prioritized(NewTableHTMLRenderer(), 500),
util.Prioritized(NewTableHTMLRenderer(e.options...), 500),
))
}

View file

@ -4,18 +4,391 @@ import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/ast"
east "github.com/yuin/goldmark/extension/ast"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
)
func TestTable(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
html.WithXHTML(),
),
goldmark.WithExtensions(
Table,
),
)
testutil.DoTestCaseFile(markdown, "_test/table.txt", t)
testutil.DoTestCaseFile(markdown, "_test/table.txt", t, testutil.ParseCliCaseArg()...)
}
func TestTableWithAlignDefault(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignDefault),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "Cell with TableCellAlignDefault and XHTML should be rendered as an align attribute",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th align="center">abc</th>
<th align="right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">bar</td>
<td align="right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
markdown = goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignDefault),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 2,
Description: "Cell with TableCellAlignDefault and HTML5 should be rendered as a style attribute",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th style="text-align:center">abc</th>
<th style="text-align:right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">bar</td>
<td style="text-align:right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
}
func TestTableWithAlignAttribute(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignAttribute),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "Cell with TableCellAlignAttribute and XHTML should be rendered as an align attribute",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th align="center">abc</th>
<th align="right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">bar</td>
<td align="right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
markdown = goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignAttribute),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 2,
Description: "Cell with TableCellAlignAttribute and HTML5 should be rendered as an align attribute",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th align="center">abc</th>
<th align="right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">bar</td>
<td align="right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
}
type tableStyleTransformer struct {
}
func (a *tableStyleTransformer) Transform(node *ast.Document, reader text.Reader, pc parser.Context) {
cell := node.FirstChild().FirstChild().FirstChild().(*east.TableCell)
cell.SetAttributeString("style", []byte("font-size:1em"))
}
func TestTableWithAlignStyle(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignStyle),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "Cell with TableCellAlignStyle and XHTML should be rendered as a style attribute",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th style="text-align:center">abc</th>
<th style="text-align:right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">bar</td>
<td style="text-align:right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
markdown = goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignStyle),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 2,
Description: "Cell with TableCellAlignStyle and HTML5 should be rendered as a style attribute",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th style="text-align:center">abc</th>
<th style="text-align:right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">bar</td>
<td style="text-align:right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
markdown = goldmark.New(
goldmark.WithParserOptions(
parser.WithASTTransformers(
util.Prioritized(&tableStyleTransformer{}, 0),
),
),
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignStyle),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 3,
Description: "Styled cell should not be broken the style by the alignments",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th style="font-size:1em;text-align:center">abc</th>
<th style="text-align:right">defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:center">bar</td>
<td style="text-align:right">baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
}
func TestTableWithAlignNone(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(
WithTableCellAlignMethod(TableCellAlignNone),
),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "Cell with TableCellAlignStyle and XHTML should not be rendered",
Markdown: `
| abc | defghi |
:-: | -----------:
bar | baz
`,
Expected: `<table>
<thead>
<tr>
<th>abc</th>
<th>defghi</th>
</tr>
</thead>
<tbody>
<tr>
<td>bar</td>
<td>baz</td>
</tr>
</tbody>
</table>`,
},
t,
)
}
func TestTableFuzzedPanics(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "This should not panic",
Markdown: "* 0\n-|\n\t0",
Expected: `<ul>
<li>
<table>
<thead>
<tr>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
</tr>
</tbody>
</table>
</li>
</ul>`,
},
t,
)
}

View file

@ -1,6 +1,8 @@
package extension
import (
"regexp"
"github.com/yuin/goldmark"
gast "github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/extension/ast"
@ -9,7 +11,6 @@ import (
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"regexp"
)
var taskListRegexp = regexp.MustCompile(`^\[([\sxX])\]\s*`)
@ -40,6 +41,9 @@ func (s *taskCheckBoxParser) Parse(parent gast.Node, block text.Reader, pc parse
return nil
}
if parent.HasChildren() {
return nil
}
if _, ok := parent.Parent().(*gast.ListItem); !ok {
return nil
}
@ -80,21 +84,22 @@ func (r *TaskCheckBoxHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncRe
reg.Register(ast.KindTaskCheckBox, r.renderTaskCheckBox)
}
func (r *TaskCheckBoxHTMLRenderer) renderTaskCheckBox(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
func (r *TaskCheckBoxHTMLRenderer) renderTaskCheckBox(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if !entering {
return gast.WalkContinue, nil
}
n := node.(*ast.TaskCheckBox)
if n.IsChecked {
w.WriteString(`<input checked="" disabled="" type="checkbox"`)
_, _ = w.WriteString(`<input checked="" disabled="" type="checkbox"`)
} else {
w.WriteString(`<input disabled="" type="checkbox"`)
_, _ = w.WriteString(`<input disabled="" type="checkbox"`)
}
if r.XHTML {
w.WriteString(" />")
_, _ = w.WriteString(" /> ")
} else {
w.WriteString(">")
_, _ = w.WriteString("> ")
}
return gast.WalkContinue, nil
}

View file

@ -4,8 +4,8 @@ import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestTaskList(t *testing.T) {
@ -17,5 +17,5 @@ func TestTaskList(t *testing.T) {
TaskList,
),
)
testutil.DoTestCaseFile(markdown, "_test/tasklist.txt", t)
testutil.DoTestCaseFile(markdown, "_test/tasklist.txt", t, testutil.ParseCliCaseArg()...)
}

View file

@ -1,6 +1,8 @@
package extension
import (
"unicode"
"github.com/yuin/goldmark"
gast "github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/parser"
@ -8,29 +10,52 @@ import (
"github.com/yuin/goldmark/util"
)
var uncloseCounterKey = parser.NewContextKey()
type unclosedCounter struct {
Single int
Double int
}
func (u *unclosedCounter) Reset() {
u.Single = 0
u.Double = 0
}
func getUnclosedCounter(pc parser.Context) *unclosedCounter {
v := pc.Get(uncloseCounterKey)
if v == nil {
v = &unclosedCounter{}
pc.Set(uncloseCounterKey, v)
}
return v.(*unclosedCounter)
}
// TypographicPunctuation is a key of the punctuations that can be replaced with
// typographic entities.
type TypographicPunctuation int
const (
// LeftSingleQuote is '
// LeftSingleQuote is ' .
LeftSingleQuote TypographicPunctuation = iota + 1
// RightSingleQuote is '
// RightSingleQuote is ' .
RightSingleQuote
// LeftDoubleQuote is "
// LeftDoubleQuote is " .
LeftDoubleQuote
// RightDoubleQuote is "
// RightDoubleQuote is " .
RightDoubleQuote
// EnDash is --
// EnDash is -- .
EnDash
// EmDash is ---
// EmDash is --- .
EmDash
// Ellipsis is ...
// Ellipsis is ... .
Ellipsis
// LeftAngleQuote is <<
// LeftAngleQuote is << .
LeftAngleQuote
// RightAngleQuote is >>
// RightAngleQuote is >> .
RightAngleQuote
// Apostrophe is ' .
Apostrophe
typographicPunctuationMax
)
@ -52,6 +77,7 @@ func newDefaultSubstitutions() [][]byte {
replacements[Ellipsis] = []byte("&hellip;")
replacements[LeftAngleQuote] = []byte("&laquo;")
replacements[RightAngleQuote] = []byte("&raquo;")
replacements[Apostrophe] = []byte("&rsquo;")
return replacements
}
@ -89,10 +115,10 @@ func (o *withTypographicSubstitutions) SetTypographerOption(p *TypographerConfig
// WithTypographicSubstitutions is a functional otpion that specify replacement text
// for punctuations.
func WithTypographicSubstitutions(values map[TypographicPunctuation][]byte) TypographerOption {
func WithTypographicSubstitutions[T []byte | string](values map[TypographicPunctuation]T) TypographerOption {
replacements := newDefaultSubstitutions()
for k, v := range values {
replacements[k] = v
replacements[k] = []byte(v)
}
return &withTypographicSubstitutions{replacements}
@ -134,11 +160,10 @@ func NewTypographerParser(opts ...TypographerOption) parser.InlineParser {
}
func (s *typographerParser) Trigger() []byte {
return []byte{'\'', '"', '-', '.', '<', '>'}
return []byte{'\'', '"', '-', '.', ',', '<', '>', '*', '['}
}
func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser.Context) gast.Node {
before := block.PrecendingCharacter()
line, _ := block.PeekLine()
c := line[0]
if len(line) > 2 {
@ -184,22 +209,89 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
}
}
if c == '\'' || c == '"' {
before := block.PrecendingCharacter()
d := parser.ScanDelimiter(line, before, 1, defaultTypographerDelimiterProcessor)
if d == nil {
return nil
}
counter := getUnclosedCounter(pc)
if c == '\'' {
if s.Substitutions[Apostrophe] != nil {
// Handle decade abbrevations such as '90s
if d.CanOpen && !d.CanClose && len(line) > 3 &&
util.IsNumeric(line[1]) && util.IsNumeric(line[2]) && line[3] == 's' {
after := rune(' ')
if len(line) > 4 {
after = util.ToRune(line, 4)
}
if len(line) == 3 || util.IsSpaceRune(after) || util.IsPunctRune(after) {
node := gast.NewString(s.Substitutions[Apostrophe])
node.SetCode(true)
block.Advance(1)
return node
}
}
// special cases: 'twas, 'em, 'net
if len(line) > 1 && (unicode.IsPunct(before) || unicode.IsSpace(before)) &&
(line[1] == 't' || line[1] == 'e' || line[1] == 'n' || line[1] == 'l') {
node := gast.NewString(s.Substitutions[Apostrophe])
node.SetCode(true)
block.Advance(1)
return node
}
// Convert normal apostrophes. This is probably more flexible than necessary but
// converts any apostrophe in between two alphanumerics.
if len(line) > 1 && (unicode.IsDigit(before) || unicode.IsLetter(before)) &&
(unicode.IsLetter(util.ToRune(line, 1))) {
node := gast.NewString(s.Substitutions[Apostrophe])
node.SetCode(true)
block.Advance(1)
return node
}
}
if s.Substitutions[LeftSingleQuote] != nil && d.CanOpen && !d.CanClose {
node := gast.NewString(s.Substitutions[LeftSingleQuote])
nt := LeftSingleQuote
// special cases: Alice's, I'm, Don't, You'd
if len(line) > 1 && (line[1] == 's' || line[1] == 'm' || line[1] == 't' || line[1] == 'd') &&
(len(line) < 3 || util.IsPunct(line[2]) || util.IsSpace(line[2])) {
nt = RightSingleQuote
}
// special cases: I've, I'll, You're
if len(line) > 2 && ((line[1] == 'v' && line[2] == 'e') ||
(line[1] == 'l' && line[2] == 'l') || (line[1] == 'r' && line[2] == 'e')) &&
(len(line) < 4 || util.IsPunct(line[3]) || util.IsSpace(line[3])) {
nt = RightSingleQuote
}
if nt == LeftSingleQuote {
counter.Single++
}
node := gast.NewString(s.Substitutions[nt])
node.SetCode(true)
block.Advance(1)
return node
}
if s.Substitutions[RightSingleQuote] != nil && d.CanClose && !d.CanOpen {
node := gast.NewString(s.Substitutions[RightSingleQuote])
node.SetCode(true)
block.Advance(1)
return node
if s.Substitutions[RightSingleQuote] != nil {
// plural possesive and abbreviations: Smiths', doin'
if len(line) > 1 && unicode.IsSpace(util.ToRune(line, 0)) || unicode.IsPunct(util.ToRune(line, 0)) &&
(len(line) > 2 && !unicode.IsDigit(util.ToRune(line, 1))) {
node := gast.NewString(s.Substitutions[RightSingleQuote])
node.SetCode(true)
block.Advance(1)
return node
}
}
if s.Substitutions[RightSingleQuote] != nil && counter.Single > 0 {
isClose := d.CanClose && !d.CanOpen
maybeClose := d.CanClose && d.CanOpen && len(line) > 1 && unicode.IsPunct(util.ToRune(line, 1)) &&
(len(line) == 2 || (len(line) > 2 && util.IsPunct(line[2]) || util.IsSpace(line[2])))
if isClose || maybeClose {
node := gast.NewString(s.Substitutions[RightSingleQuote])
node.SetCode(true)
block.Advance(1)
counter.Single--
return node
}
}
}
if c == '"' {
@ -207,13 +299,24 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
node := gast.NewString(s.Substitutions[LeftDoubleQuote])
node.SetCode(true)
block.Advance(1)
counter.Double++
return node
}
if s.Substitutions[RightDoubleQuote] != nil && d.CanClose && !d.CanOpen {
node := gast.NewString(s.Substitutions[RightDoubleQuote])
node.SetCode(true)
block.Advance(1)
return node
if s.Substitutions[RightDoubleQuote] != nil && counter.Double > 0 {
isClose := d.CanClose && !d.CanOpen
maybeClose := d.CanClose && d.CanOpen && len(line) > 1 && (unicode.IsPunct(util.ToRune(line, 1))) &&
(len(line) == 2 || (len(line) > 2 && util.IsPunct(line[2]) || util.IsSpace(line[2])))
if isClose || maybeClose {
// special case: "Monitor 21""
if len(line) > 1 && line[1] == '"' && unicode.IsDigit(before) {
return nil
}
node := gast.NewString(s.Substitutions[RightDoubleQuote])
node.SetCode(true)
block.Advance(1)
counter.Double--
return node
}
}
}
}
@ -221,17 +324,17 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
}
func (s *typographerParser) CloseBlock(parent gast.Node, pc parser.Context) {
// nothing to do
getUnclosedCounter(pc).Reset()
}
type typographer struct {
options []TypographerOption
}
// Typographer is an extension that repalace punctuations with typographic entities.
// Typographer is an extension that replaces punctuations with typographic entities.
var Typographer = &typographer{}
// NewTypographer returns a new Entender that repalace punctuations with typographic entities.
// NewTypographer returns a new Extender that replaces punctuations with typographic entities.
func NewTypographer(opts ...TypographerOption) goldmark.Extender {
return &typographer{
options: opts,

View file

@ -4,8 +4,8 @@ import (
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestTypographer(t *testing.T) {
@ -17,5 +17,5 @@ func TestTypographer(t *testing.T) {
Typographer,
),
)
testutil.DoTestCaseFile(markdown, "_test/typographer.txt", t)
testutil.DoTestCaseFile(markdown, "_test/typographer.txt", t, testutil.ParseCliCaseArg()...)
}

View file

@ -1,17 +1,221 @@
package goldmark_test
import (
"bytes"
"os"
"strconv"
"strings"
"testing"
"time"
. "github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
)
func TestDefinitionList(t *testing.T) {
var testTimeoutMultiplier = 1.0
func init() {
m, err := strconv.ParseFloat(os.Getenv("GOLDMARK_TEST_TIMEOUT_MULTIPLIER"), 64)
if err == nil {
testTimeoutMultiplier = m
}
}
func TestExtras(t *testing.T) {
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
testutil.DoTestCaseFile(markdown, "_test/extra.txt", t)
testutil.DoTestCaseFile(markdown, "_test/extra.txt", t, testutil.ParseCliCaseArg()...)
}
func TestEndsWithNonSpaceCharacters(t *testing.T) {
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
source := []byte("```\na\n```")
var b bytes.Buffer
err := markdown.Convert(source, &b)
if err != nil {
t.Error(err.Error())
}
if b.String() != "<pre><code>a\n</code></pre>\n" {
t.Errorf("%s \n---------\n %s", source, b.String())
}
}
func TestWindowsNewLine(t *testing.T) {
markdown := New(WithRendererOptions(
html.WithXHTML(),
))
source := []byte("a \r\nb\n")
var b bytes.Buffer
err := markdown.Convert(source, &b)
if err != nil {
t.Error(err.Error())
}
if b.String() != "<p>a<br />\nb</p>\n" {
t.Errorf("%s\n---------\n%s", source, b.String())
}
source = []byte("a\\\r\nb\r\n")
var b2 bytes.Buffer
err = markdown.Convert(source, &b2)
if err != nil {
t.Error(err.Error())
}
if b2.String() != "<p>a<br />\nb</p>\n" {
t.Errorf("\n%s\n---------\n%s", source, b2.String())
}
}
type myIDs struct {
}
func (s *myIDs) Generate(value []byte, kind ast.NodeKind) []byte {
return []byte("my-id")
}
func (s *myIDs) Put(value []byte) {
}
func TestAutogeneratedIDs(t *testing.T) {
ctx := parser.NewContext(parser.WithIDs(&myIDs{}))
markdown := New(WithParserOptions(parser.WithAutoHeadingID()))
source := []byte("# Title1\n## Title2")
var b bytes.Buffer
err := markdown.Convert(source, &b, parser.WithContext(ctx))
if err != nil {
t.Error(err.Error())
}
if b.String() != `<h1 id="my-id">Title1</h1>
<h2 id="my-id">Title2</h2>
` {
t.Errorf("%s\n---------\n%s", source, b.String())
}
}
func nowMillis() int64 {
// TODO: replace UnixNano to UnixMillis(drops Go1.16 support)
return time.Now().UnixNano() / 1000000
}
func TestDeepNestedLabelPerformance(t *testing.T) {
if testing.Short() {
t.Skip("skipping performance test in short mode")
}
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
started := nowMillis()
n := 50000
source := []byte(strings.Repeat("[", n) + strings.Repeat("]", n))
var b bytes.Buffer
_ = markdown.Convert(source, &b)
finished := nowMillis()
if (finished - started) > int64(5000*testTimeoutMultiplier) {
t.Error("Parsing deep nested labels took too long")
}
}
func TestManyProcessingInstructionPerformance(t *testing.T) {
if testing.Short() {
t.Skip("skipping performance test in short mode")
}
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
started := nowMillis()
n := 50000
source := []byte("a " + strings.Repeat("<?", n))
var b bytes.Buffer
_ = markdown.Convert(source, &b)
finished := nowMillis()
if (finished - started) > int64(5000*testTimeoutMultiplier) {
t.Error("Parsing processing instructions took too long")
}
}
func TestManyCDATAPerformance(t *testing.T) {
if testing.Short() {
t.Skip("skipping performance test in short mode")
}
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
started := nowMillis()
n := 50000
source := []byte(strings.Repeat("a <![CDATA[", n))
var b bytes.Buffer
_ = markdown.Convert(source, &b)
finished := nowMillis()
if (finished - started) > int64(5000*testTimeoutMultiplier) {
t.Error("Parsing processing instructions took too long")
}
}
func TestManyDeclPerformance(t *testing.T) {
if testing.Short() {
t.Skip("skipping performance test in short mode")
}
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
started := nowMillis()
n := 50000
source := []byte(strings.Repeat("a <!A ", n))
var b bytes.Buffer
_ = markdown.Convert(source, &b)
finished := nowMillis()
if (finished - started) > int64(5000*testTimeoutMultiplier) {
t.Error("Parsing processing instructions took too long")
}
}
func TestManyCommentPerformance(t *testing.T) {
if testing.Short() {
t.Skip("skipping performance test in short mode")
}
markdown := New(WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
))
started := nowMillis()
n := 50000
source := []byte(strings.Repeat("a <!-- ", n))
var b bytes.Buffer
_ = markdown.Convert(source, &b)
finished := nowMillis()
if (finished - started) > int64(5000*testTimeoutMultiplier) {
t.Error("Parsing processing instructions took too long")
}
}
func TestDangerousURLStringCase(t *testing.T) {
markdown := New()
source := []byte(`[Basic](javascript:alert('Basic'))
[CaseInsensitive](JaVaScRiPt:alert('CaseInsensitive'))
`)
expected := []byte(`<p><a href="">Basic</a>
<a href="">CaseInsensitive</a></p>
`)
var b bytes.Buffer
_ = markdown.Convert(source, &b)
if !bytes.Equal(expected, b.Bytes()) {
t.Error("Dangerous URL should ignore cases:\n" + string(testutil.DiffPretty(expected, b.Bytes())))
}
}

View file

@ -1,28 +0,0 @@
package fuzz
import (
"bytes"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
"github.com/yuin/goldmark/renderer/html"
)
func Fuzz(data []byte) int {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
extension.DefinitionList,
extension.Footnote,
extension.GFM,
extension.Typographer,
),
)
var b bytes.Buffer
if err := markdown.Convert(data, &b); err != nil {
return 0
}
return 1
}

View file

@ -2,40 +2,56 @@ package fuzz
import (
"bytes"
"fmt"
"io/ioutil"
"encoding/json"
"os"
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/extension"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/util"
)
var _ = fmt.Printf
func fuzz(f *testing.F) {
f.Fuzz(func(t *testing.T, orig string) {
markdown := goldmark.New(
goldmark.WithParserOptions(
parser.WithAutoHeadingID(),
parser.WithAttribute(),
),
goldmark.WithRendererOptions(
html.WithUnsafe(),
html.WithXHTML(),
),
goldmark.WithExtensions(
extension.DefinitionList,
extension.Footnote,
extension.GFM,
extension.Typographer,
extension.Linkify,
extension.Table,
extension.TaskList,
),
)
var b bytes.Buffer
if err := markdown.Convert(util.StringToReadOnlyBytes(orig), &b); err != nil {
panic(err)
}
})
}
func TestFuzz(t *testing.T) {
crasher := "6dff3d03167cb144d4e2891edac76ee740a77bc7"
data, err := ioutil.ReadFile("crashers/" + crasher)
func FuzzDefault(f *testing.F) {
bs, err := os.ReadFile("../_test/spec.json")
if err != nil {
return
}
fmt.Printf("%s\n", util.VisualizeSpaces(data))
fmt.Println("||||||||||||||||||||||")
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
extension.DefinitionList,
extension.Footnote,
extension.GFM,
extension.Typographer,
),
)
var b bytes.Buffer
if err := markdown.Convert(data, &b); err != nil {
panic(err)
}
fmt.Println(b.String())
var testCases []map[string]interface{}
if err := json.Unmarshal(bs, &testCases); err != nil {
panic(err)
}
for _, c := range testCases {
f.Add(c["markdown"])
}
fuzz(f)
}

9
fuzz/oss_fuzz_test.go Normal file
View file

@ -0,0 +1,9 @@
package fuzz
import (
"testing"
)
func FuzzOss(f *testing.F) {
fuzz(f)
}

2
go.mod
View file

@ -1,3 +1,3 @@
module github.com/yuin/goldmark
go 1.12
go 1.19

View file

@ -2,12 +2,13 @@
package goldmark
import (
"io"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"io"
)
// DefaultParser returns a new Parser that is configured by default values.

View file

@ -4,8 +4,8 @@ import (
"testing"
. "github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/testutil"
)
func TestAttributeAndAutoHeadingID(t *testing.T) {
@ -15,5 +15,5 @@ func TestAttributeAndAutoHeadingID(t *testing.T) {
parser.WithAutoHeadingID(),
),
)
testutil.DoTestCaseFile(markdown, "_test/options.txt", t)
testutil.DoTestCaseFile(markdown, "_test/options.txt", t, testutil.ParseCliCaseArg()...)
}

View file

@ -2,15 +2,17 @@ package parser
import (
"bytes"
"io"
"strconv"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"strconv"
)
var attrNameID = []byte("id")
var attrNameClass = []byte("class")
// An Attribute is an attribute of the markdown elements
// An Attribute is an attribute of the markdown elements.
type Attribute struct {
Name []byte
Value interface{}
@ -63,11 +65,9 @@ func ParseAttributes(reader text.Reader) (Attributes, bool) {
}
if bytes.Equal(attr.Name, attrNameClass) {
if !attrs.findUpdate(attrNameClass, func(v interface{}) interface{} {
var ret interface{}
if ret, ok = v.([][]byte); !ok {
ret = [][]byte{v.([]byte)}
}
return append(ret.([][]byte), attr.Value.([]byte))
ret := make([]byte, 0, len(v.([]byte))+1+len(attr.Value.([]byte)))
ret = append(ret, v.([]byte)...)
return append(append(ret, ' '), attr.Value.([]byte)...)
}) {
attrs = append(attrs, attr)
}
@ -89,7 +89,12 @@ func parseAttribute(reader text.Reader) (Attribute, bool) {
reader.Advance(1)
line, _ := reader.PeekLine()
i := 0
for ; i < len(line) && !util.IsSpace(line[i]) && (!util.IsPunct(line[i]) || line[i] == '_' || line[i] == '-'); i++ {
// HTML5 allows any kind of characters as id, but XHTML restricts characters for id.
// CommonMark is basically defined for XHTML(even though it is legacy).
// So we restrict id characters.
for ; i < len(line) && !util.IsSpace(line[i]) &&
(!util.IsPunct(line[i]) || line[i] == '_' ||
line[i] == '-' || line[i] == ':' || line[i] == '.'); i++ {
}
name := attrNameClass
if c == '#' {
@ -99,6 +104,9 @@ func parseAttribute(reader text.Reader) (Attribute, bool) {
return Attribute{Name: name, Value: line[0:i]}, true
}
line, _ := reader.PeekLine()
if len(line) == 0 {
return Attribute{}, false
}
c = line[0]
if !((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
c == '_' || c == ':') {
@ -106,7 +114,7 @@ func parseAttribute(reader text.Reader) (Attribute, bool) {
}
i := 0
for ; i < len(line); i++ {
c := line[i]
c = line[i]
if !((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
(c >= '0' && c <= '9') ||
c == '_' || c == ':' || c == '.' || c == '-') {
@ -126,15 +134,19 @@ func parseAttribute(reader text.Reader) (Attribute, bool) {
if !ok {
return Attribute{}, false
}
if bytes.Equal(name, attrNameClass) {
if _, ok = value.([]byte); !ok {
return Attribute{}, false
}
}
return Attribute{Name: name, Value: value}, true
}
func parseAttributeValue(reader text.Reader) (interface{}, bool) {
reader.SkipSpaces()
c := reader.Peek()
var value interface{}
ok := false
var ok bool
switch c {
case text.EOF:
return Attribute{}, false
@ -155,7 +167,6 @@ func parseAttributeValue(reader text.Reader) (interface{}, bool) {
return nil, false
}
return value, true
}
func parseAttributeArray(reader text.Reader) ([]interface{}, bool) {
@ -230,11 +241,11 @@ func parseAttributeString(reader text.Reader) ([]byte, bool) {
return nil, false
}
func scanAttributeDecimal(reader text.Reader, w *bytes.Buffer) {
func scanAttributeDecimal(reader text.Reader, w io.ByteWriter) {
for {
c := reader.Peek()
if util.IsNumeric(c) {
w.WriteByte(c)
_ = w.WriteByte(c)
} else {
return
}
@ -276,7 +287,7 @@ func parseAttributeNumber(reader text.Reader) (float64, bool) {
}
scanAttributeDecimal(reader, &buf)
}
f, err := strconv.ParseFloat(buf.String(), 10)
f, err := strconv.ParseFloat(buf.String(), 64)
if err != nil {
return 0, false
}

View file

@ -13,7 +13,7 @@ type HeadingConfig struct {
}
// SetOption implements SetOptioner.
func (b *HeadingConfig) SetOption(name OptionName, value interface{}) {
func (b *HeadingConfig) SetOption(name OptionName, _ interface{}) {
switch name {
case optAutoHeadingID:
b.AutoHeadingID = true
@ -91,11 +91,17 @@ func (b *atxHeadingParser) Open(parent ast.Node, reader text.Reader, pc Context)
if i == pos || level > 6 {
return nil, NoChildren
}
if i == len(line) { // alone '#' (without a new line character)
return ast.NewHeading(level), NoChildren
}
l := util.TrimLeftSpaceLength(line[i:])
if l == 0 {
return nil, NoChildren
}
start := i + l
if start >= len(line) {
start = len(line) - 1
}
origstart := start
stop := len(line) - util.TrimRightSpaceLength(line)
@ -105,30 +111,33 @@ func (b *atxHeadingParser) Open(parent ast.Node, reader text.Reader, pc Context)
start--
closureClose := -1
closureOpen := -1
for i := start; i < stop; {
c := line[i]
if util.IsEscapedPunctuation(line, i) {
i += 2
} else if util.IsSpace(c) && i < stop-1 && line[i+1] == '#' {
closureOpen = i + 1
j := i + 1
for ; j < stop && line[j] == '#'; j++ {
for j := start; j < stop; {
c := line[j]
if util.IsEscapedPunctuation(line, j) {
j += 2
} else if util.IsSpace(c) && j < stop-1 && line[j+1] == '#' {
closureOpen = j + 1
k := j + 1
for ; k < stop && line[k] == '#'; k++ {
}
closureClose = j
closureClose = k
break
} else {
i++
j++
}
}
if closureClose > 0 {
reader.Advance(closureClose)
attrs, ok := ParseAttributes(reader)
parsed = ok
rest, _ := reader.PeekLine()
parsed = ok && util.IsBlank(rest)
if parsed {
for _, attr := range attrs {
node.SetAttribute(attr.Name, attr.Value)
}
node.Lines().Append(text.NewSegment(segment.Start+start+1, segment.Start+closureOpen))
node.Lines().Append(text.NewSegment(
segment.Start+start+1-segment.Padding,
segment.Start+closureOpen-segment.Padding))
}
}
}
@ -136,7 +145,7 @@ func (b *atxHeadingParser) Open(parent ast.Node, reader text.Reader, pc Context)
start = origstart
stop := len(line) - util.TrimRightSpaceLength(line)
if stop <= start { // empty headings like '##[space]'
stop = start + 1
stop = start
} else {
i = stop - 1
for ; line[i] == '#' && i >= start; i-- {
@ -149,7 +158,7 @@ func (b *atxHeadingParser) Open(parent ast.Node, reader text.Reader, pc Context)
}
if len(util.TrimRight(line[start:stop], []byte{'#'})) != 0 { // empty heading like '### ###'
node.Lines().Append(text.NewSegment(segment.Start+start, segment.Start+stop))
node.Lines().Append(text.NewSegment(segment.Start+start-segment.Padding, segment.Start+stop-segment.Padding))
}
}
return node, NoChildren
@ -168,9 +177,11 @@ func (b *atxHeadingParser) Close(node ast.Node, reader text.Reader, pc Context)
}
if b.AutoHeadingID {
_, ok := node.AttributeString("id")
id, ok := node.AttributeString("id")
if !ok {
generateAutoHeadingID(node.(*ast.Heading), reader, pc)
} else {
pc.IDs().Put(id.([]byte))
}
}
}
@ -183,18 +194,22 @@ func (b *atxHeadingParser) CanAcceptIndentedLine() bool {
return false
}
var attrAutoHeadingIDPrefix = []byte("heading")
func generateAutoHeadingID(node *ast.Heading, reader text.Reader, pc Context) {
var line []byte
lastIndex := node.Lines().Len() - 1
lastLine := node.Lines().At(lastIndex)
line := lastLine.Value(reader.Source())
headingID := pc.IDs().Generate(line, attrAutoHeadingIDPrefix)
if lastIndex > -1 {
lastLine := node.Lines().At(lastIndex)
line = lastLine.Value(reader.Source())
}
headingID := pc.IDs().Generate(line, ast.KindHeading)
node.SetAttribute(attrNameID, headingID)
}
func parseLastLineAttributes(node ast.Node, reader text.Reader, pc Context) {
lastIndex := node.Lines().Len() - 1
if lastIndex < 0 { // empty headings
return
}
lastLine := node.Lines().At(lastIndex)
line := lastLine.Value(reader.Source())
lr := text.NewReader(line)
@ -223,7 +238,7 @@ func parseLastLineAttributes(node ast.Node, reader text.Reader, pc Context) {
}
lr.Advance(1)
}
if ok && util.IsBlank(line[end.Stop:]) {
if ok && util.IsBlank(line[end.Start:]) {
for _, attr := range attrs {
node.SetAttribute(attr.Name, attr.Value)
}

View file

@ -19,7 +19,7 @@ func NewBlockquoteParser() BlockParser {
func (b *blockquoteParser) process(reader text.Reader) bool {
line, _ := reader.PeekLine()
w, pos := util.IndentWidth(line, 0)
w, pos := util.IndentWidth(line, reader.LineOffset())
if w > 3 || pos >= len(line) || line[pos] != '>' {
return false
}
@ -28,12 +28,13 @@ func (b *blockquoteParser) process(reader text.Reader) bool {
reader.Advance(pos)
return true
}
if line[pos] == ' ' || line[pos] == '\t' {
pos++
}
reader.Advance(pos)
if line[pos-1] == '\t' {
reader.SetPadding(2)
if line[pos] == ' ' || line[pos] == '\t' {
padding := 0
if line[pos] == '\t' {
padding = util.TabWidth(reader.LineOffset()) - 1
}
reader.AdvanceAndSetPadding(1, padding)
}
return true
}

View file

@ -31,6 +31,11 @@ func (b *codeBlockParser) Open(parent ast.Node, reader text.Reader, pc Context)
node := ast.NewCodeBlock()
reader.AdvanceAndSetPadding(pos, padding)
_, segment = reader.PeekLine()
// if code block line starts with a tab, keep a tab as it is.
if segment.Padding != 0 {
preserveLeadingTabInCodeBlock(&segment, reader, 0)
}
segment.ForceNewline = true
node.Lines().Append(segment)
reader.Advance(segment.Len() - 1)
return node, NoChildren
@ -49,6 +54,13 @@ func (b *codeBlockParser) Continue(node ast.Node, reader text.Reader, pc Context
}
reader.AdvanceAndSetPadding(pos, padding)
_, segment = reader.PeekLine()
// if code block line starts with a tab, keep a tab as it is.
if segment.Padding != 0 {
preserveLeadingTabInCodeBlock(&segment, reader, 0)
}
segment.ForceNewline = true
node.Lines().Append(segment)
reader.Advance(segment.Len() - 1)
return Continue | NoChildren
@ -77,3 +89,14 @@ func (b *codeBlockParser) CanInterruptParagraph() bool {
func (b *codeBlockParser) CanAcceptIndentedLine() bool {
return true
}
func preserveLeadingTabInCodeBlock(segment *text.Segment, reader text.Reader, indent int) {
offsetWithPadding := reader.LineOffset() + indent
sl, ss := reader.Position()
reader.SetPosition(sl, text.NewSegment(ss.Start-1, ss.Stop))
if offsetWithPadding == reader.LineOffset() {
segment.Padding = 0
segment.Start--
}
reader.SetPosition(sl, ss)
}

View file

@ -3,7 +3,6 @@ package parser
import (
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
)
type codeSpanParser struct {
@ -42,8 +41,8 @@ func (s *codeSpanParser) Parse(parent ast.Node, block text.Reader, pc Context) a
for ; i < len(line) && line[i] == '`'; i++ {
}
closure := i - oldi
if closure == opener && (i+1 >= len(line) || line[i+1] != '`') {
segment := segment.WithStop(segment.Start + i - closure)
if closure == opener && (i >= len(line) || line[i] != '`') {
segment = segment.WithStop(segment.Start + i - closure)
if !segment.IsEmpty() {
node.AppendChild(node, ast.NewRawTextSegment(segment))
}
@ -52,9 +51,7 @@ func (s *codeSpanParser) Parse(parent ast.Node, block text.Reader, pc Context) a
}
}
}
if !util.IsBlank(line) {
node.AppendChild(node, ast.NewRawTextSegment(segment))
}
node.AppendChild(node, ast.NewRawTextSegment(segment))
block.AdvanceLine()
}
end:
@ -62,11 +59,11 @@ end:
// trim first halfspace and last halfspace
segment := node.FirstChild().(*ast.Text).Segment
shouldTrimmed := true
if !(!segment.IsEmpty() && block.Source()[segment.Start] == ' ') {
if !(!segment.IsEmpty() && isSpaceOrNewline(block.Source()[segment.Start])) {
shouldTrimmed = false
}
segment = node.LastChild().(*ast.Text).Segment
if !(!segment.IsEmpty() && block.Source()[segment.Stop-1] == ' ') {
if !(!segment.IsEmpty() && isSpaceOrNewline(block.Source()[segment.Stop-1])) {
shouldTrimmed = false
}
if shouldTrimmed {
@ -81,3 +78,7 @@ end:
}
return node
}
func isSpaceOrNewline(c byte) bool {
return c == ' ' || c == '\n'
}

View file

@ -3,7 +3,6 @@ package parser
import (
"fmt"
"strings"
"unicode"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/text"
@ -11,7 +10,7 @@ import (
)
// A DelimiterProcessor interface provides a set of functions about
// Deliiter nodes.
// Delimiter nodes.
type DelimiterProcessor interface {
// IsDelimiter returns true if given character is a delimiter, otherwise false.
IsDelimiter(byte) bool
@ -31,14 +30,14 @@ type Delimiter struct {
Segment text.Segment
// CanOpen is set true if this delimiter can open a span for a new node.
// See https://spec.commonmark.org/0.29/#can-open-emphasis for details.
// See https://spec.commonmark.org/0.30/#can-open-emphasis for details.
CanOpen bool
// CanClose is set true if this delimiter can close a span for a new node.
// See https://spec.commonmark.org/0.29/#can-open-emphasis for details.
// See https://spec.commonmark.org/0.30/#can-open-emphasis for details.
CanClose bool
// Length is a remaining length of this delmiter.
// Length is a remaining length of this delimiter.
Length int
// OriginalLength is a original length of this delimiter.
@ -67,12 +66,12 @@ func (d *Delimiter) Dump(source []byte, level int) {
var kindDelimiter = ast.NewNodeKind("Delimiter")
// Kind implements Node.Kind
// Kind implements Node.Kind.
func (d *Delimiter) Kind() ast.NodeKind {
return kindDelimiter
}
// Text implements Node.Text
// Text implements Node.Text.
func (d *Delimiter) Text(source []byte) []byte {
return d.Segment.Value(source)
}
@ -127,15 +126,15 @@ func ScanDelimiter(line []byte, before rune, min int, processor DelimiterProcess
after = util.ToRune(line, j)
}
isLeft, isRight, canOpen, canClose := false, false, false, false
beforeIsPunctuation := unicode.IsPunct(before)
beforeIsWhitespace := unicode.IsSpace(before)
afterIsPunctuation := unicode.IsPunct(after)
afterIsWhitespace := unicode.IsSpace(after)
var canOpen, canClose bool
beforeIsPunctuation := util.IsPunctRune(before)
beforeIsWhitespace := util.IsSpaceRune(before)
afterIsPunctuation := util.IsPunctRune(after)
afterIsWhitespace := util.IsSpaceRune(after)
isLeft = !afterIsWhitespace &&
isLeft := !afterIsWhitespace &&
(!afterIsPunctuation || beforeIsWhitespace || beforeIsPunctuation)
isRight = !beforeIsWhitespace &&
isRight := !beforeIsWhitespace &&
(!beforeIsPunctuation || afterIsWhitespace || afterIsPunctuation)
if line[i] == '_' {
@ -156,20 +155,19 @@ func ScanDelimiter(line []byte, before rune, min int, processor DelimiterProcess
// If you implement an inline parser that can have other inline nodes as
// children, you should call this function when nesting span has closed.
func ProcessDelimiters(bottom ast.Node, pc Context) {
if pc.LastDelimiter() == nil {
lastDelimiter := pc.LastDelimiter()
if lastDelimiter == nil {
return
}
var closer *Delimiter
if bottom != nil {
for c := pc.LastDelimiter().PreviousSibling(); c != nil; {
if d, ok := c.(*Delimiter); ok {
closer = d
if bottom != lastDelimiter {
for c := lastDelimiter.PreviousSibling(); c != nil && c != bottom; {
if d, ok := c.(*Delimiter); ok {
closer = d
}
c = c.PreviousSibling()
}
prev := c.PreviousSibling()
if prev == bottom {
break
}
c = prev
}
} else {
closer = pc.FirstDelimiter()
@ -187,7 +185,7 @@ func ProcessDelimiters(bottom ast.Node, pc Context) {
found := false
maybeOpener := false
var opener *Delimiter
for opener = closer.PreviousDelimiter; opener != nil; opener = opener.PreviousDelimiter {
for opener = closer.PreviousDelimiter; opener != nil && opener != bottom; opener = opener.PreviousDelimiter {
if opener.CanOpen && opener.Processor.CanOpenCloser(opener, closer) {
maybeOpener = true
consume = opener.CalcComsumption(closer)
@ -198,10 +196,11 @@ func ProcessDelimiters(bottom ast.Node, pc Context) {
}
}
if !found {
next := closer.NextDelimiter
if !maybeOpener && !closer.CanOpen {
pc.RemoveDelimiter(closer)
}
closer = closer.NextDelimiter
closer = next
continue
}
opener.ConsumeCharacters(consume)

View file

@ -71,20 +71,36 @@ func (b *fencedCodeBlockParser) Open(parent ast.Node, reader text.Reader, pc Con
func (b *fencedCodeBlockParser) Continue(node ast.Node, reader text.Reader, pc Context) State {
line, segment := reader.PeekLine()
fdata := pc.Get(fencedCodeBlockInfoKey).(*fenceData)
w, pos := util.IndentWidth(line, 0)
w, pos := util.IndentWidth(line, reader.LineOffset())
if w < 4 {
i := pos
for ; i < len(line) && line[i] == fdata.char; i++ {
}
length := i - pos
if length >= fdata.length && util.IsBlank(line[i:]) {
reader.Advance(segment.Stop - segment.Start - 1 - segment.Padding)
newline := 1
if line[len(line)-1] != '\n' {
newline = 0
}
reader.Advance(segment.Stop - segment.Start - newline + segment.Padding)
return Close
}
}
pos, padding := util.DedentPositionPadding(line, reader.LineOffset(), segment.Padding, fdata.indent)
pos, padding := util.IndentPositionPadding(line, reader.LineOffset(), segment.Padding, fdata.indent)
if pos < 0 {
pos = util.FirstNonSpacePosition(line)
if pos < 0 {
pos = 0
}
padding = 0
}
seg := text.NewSegmentPadding(segment.Start+pos, segment.Stop, padding)
// if code block line starts with a tab, keep a tab as it is.
if padding != 0 {
preserveLeadingTabInCodeBlock(&seg, reader, fdata.indent)
}
seg.ForceNewline = true // EOF as newline
node.Lines().Append(seg)
reader.AdvanceAndSetPadding(segment.Stop-segment.Start-pos-1, padding)
return Continue | NoChildren

View file

@ -2,11 +2,12 @@ package parser
import (
"bytes"
"regexp"
"strings"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"regexp"
"strings"
)
var allowedBlockTags = map[string]bool{
@ -60,8 +61,8 @@ var allowedBlockTags = map[string]bool{
"option": true,
"p": true,
"param": true,
"search": true,
"section": true,
"source": true,
"summary": true,
"table": true,
"tbody": true,
@ -75,8 +76,8 @@ var allowedBlockTags = map[string]bool{
"ul": true,
}
var htmlBlockType1OpenRegexp = regexp.MustCompile(`(?i)^[ ]{0,3}<(script|pre|style)(?:\s.*|>.*|/>.*|)\n?$`)
var htmlBlockType1CloseRegexp = regexp.MustCompile(`(?i)^[ ]{0,3}(?:[^ ].*|)</(?:script|pre|style)>.*`)
var htmlBlockType1OpenRegexp = regexp.MustCompile(`(?i)^[ ]{0,3}<(script|pre|style|textarea)(?:\s.*|>.*|/>.*|)(?:\r\n|\n)?$`) //nolint:golint,lll
var htmlBlockType1CloseRegexp = regexp.MustCompile(`(?i)^.*</(?:script|pre|style|textarea)>.*`)
var htmlBlockType2OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<!\-\-`)
var htmlBlockType2Close = []byte{'-', '-', '>'}
@ -84,25 +85,25 @@ var htmlBlockType2Close = []byte{'-', '-', '>'}
var htmlBlockType3OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<\?`)
var htmlBlockType3Close = []byte{'?', '>'}
var htmlBlockType4OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<![A-Z]+.*\n?$`)
var htmlBlockType4OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<![A-Z]+.*(?:\r\n|\n)?$`)
var htmlBlockType4Close = []byte{'>'}
var htmlBlockType5OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<\!\[CDATA\[`)
var htmlBlockType5Close = []byte{']', ']', '>'}
var htmlBlockType6Regexp = regexp.MustCompile(`^[ ]{0,3}</?([a-zA-Z0-9]+)(?:\s.*|>.*|/>.*|)\n?$`)
var htmlBlockType6Regexp = regexp.MustCompile(`^[ ]{0,3}<(?:/[ ]*)?([a-zA-Z]+[a-zA-Z0-9\-]*)(?:[ ].*|>.*|/>.*|)(?:\r\n|\n)?$`) //nolint:golint,lll
var htmlBlockType7Regexp = regexp.MustCompile(`^[ ]{0,3}<(/)?([a-zA-Z0-9]+)(` + attributePattern + `*)(:?>|/>)\s*\n?$`)
var htmlBlockType7Regexp = regexp.MustCompile(`^[ ]{0,3}<(/[ ]*)?([a-zA-Z]+[a-zA-Z0-9\-]*)(` + attributePattern + `*)[ ]*(?:>|/>)[ ]*(?:\r\n|\n)?$`) //nolint:golint,lll
type htmlBlockParser struct {
}
var defaultHtmlBlockParser = &htmlBlockParser{}
var defaultHTMLBlockParser = &htmlBlockParser{}
// NewHTMLBlockParser return a new BlockParser that can parse html
// blocks.
func NewHTMLBlockParser() BlockParser {
return defaultHtmlBlockParser
return defaultHTMLBlockParser
}
func (b *htmlBlockParser) Trigger() []byte {
@ -117,9 +118,7 @@ func (b *htmlBlockParser) Open(parent ast.Node, reader text.Reader, pc Context)
return nil, NoChildren
}
tagName := ""
if m := htmlBlockType1OpenRegexp.FindSubmatchIndex(line); m != nil {
tagName = string(line[m[2]:m[3]])
node = ast.NewHTMLBlock(ast.HTMLBlockType1)
} else if htmlBlockType2OpenRegexp.Match(line) {
node = ast.NewHTMLBlock(ast.HTMLBlockType2)
@ -132,17 +131,18 @@ func (b *htmlBlockParser) Open(parent ast.Node, reader text.Reader, pc Context)
} else if match := htmlBlockType7Regexp.FindSubmatchIndex(line); match != nil {
isCloseTag := match[2] > -1 && bytes.Equal(line[match[2]:match[3]], []byte("/"))
hasAttr := match[6] != match[7]
tagName = strings.ToLower(string(line[match[4]:match[5]]))
_, ok := allowedBlockTags[strings.ToLower(string(tagName))]
tagName := strings.ToLower(string(line[match[4]:match[5]]))
_, ok := allowedBlockTags[tagName]
if ok {
node = ast.NewHTMLBlock(ast.HTMLBlockType6)
} else if tagName != "script" && tagName != "style" && tagName != "pre" && !ast.IsParagraph(last) && !(isCloseTag && hasAttr) { // type 7 can not interrupt paragraph
} else if tagName != "script" && tagName != "style" &&
tagName != "pre" && !ast.IsParagraph(last) && !(isCloseTag && hasAttr) { // type 7 can not interrupt paragraph
node = ast.NewHTMLBlock(ast.HTMLBlockType7)
}
}
if node == nil {
if match := htmlBlockType6Regexp.FindSubmatchIndex(line); match != nil {
tagName = string(line[match[2]:match[3]])
tagName := string(line[match[2]:match[3]])
_, ok := allowedBlockTags[strings.ToLower(tagName)]
if ok {
node = ast.NewHTMLBlock(ast.HTMLBlockType6)
@ -150,7 +150,7 @@ func (b *htmlBlockParser) Open(parent ast.Node, reader text.Reader, pc Context)
}
}
if node != nil {
reader.Advance(segment.Len() - 1)
reader.Advance(segment.Len() - util.TrimRightSpaceLength(line))
node.Lines().Append(segment)
return node, NoChildren
}
@ -173,7 +173,7 @@ func (b *htmlBlockParser) Continue(node ast.Node, reader text.Reader, pc Context
}
if htmlBlockType1CloseRegexp.Match(line) {
htmlBlock.ClosureLine = segment
reader.Advance(segment.Len() - 1)
reader.Advance(segment.Len() - util.TrimRightSpaceLength(line))
return Close
}
case ast.HTMLBlockType2:
@ -202,7 +202,7 @@ func (b *htmlBlockParser) Continue(node ast.Node, reader text.Reader, pc Context
}
if bytes.Contains(line, closurePattern) {
htmlBlock.ClosureLine = segment
reader.Advance(segment.Len() - 1)
reader.Advance(segment.Len())
return Close
}
@ -212,7 +212,7 @@ func (b *htmlBlockParser) Continue(node ast.Node, reader text.Reader, pc Context
}
}
node.Lines().Append(segment)
reader.Advance(segment.Len() - 1)
reader.Advance(segment.Len() - util.TrimRightSpaceLength(line))
return Continue | NoChildren
}

View file

@ -2,7 +2,6 @@ package parser
import (
"fmt"
"regexp"
"strings"
"github.com/yuin/goldmark/ast"
@ -49,6 +48,13 @@ func (s *linkLabelState) Kind() ast.NodeKind {
return kindLinkLabelState
}
func linkLabelStateLength(v *linkLabelState) int {
if v == nil || v.Last == nil || v.First == nil {
return 0
}
return v.Last.Segment.Stop - v.First.Segment.Start
}
func pushLinkLabelState(pc Context, v *linkLabelState) {
tlist := pc.Get(linkLabelStateKey)
var list *linkLabelState
@ -113,19 +119,20 @@ func (s *linkParser) Trigger() []byte {
return []byte{'!', '[', ']'}
}
var linkDestinationRegexp = regexp.MustCompile(`\s*([^\s].+)`)
var linkTitleRegexp = regexp.MustCompile(`\s+(\)|["'\(].+)`)
var linkBottom = NewContextKey()
func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.Node {
line, segment := block.PeekLine()
if line[0] == '!' && len(line) > 1 && line[1] == '[' {
block.Advance(1)
pc.Set(linkBottom, pc.LastDelimiter())
return processLinkLabelOpen(block, segment.Start+1, true, pc)
if line[0] == '!' {
if len(line) > 1 && line[1] == '[' {
block.Advance(1)
pushLinkBottom(pc)
return processLinkLabelOpen(block, segment.Start+1, true, pc)
}
return nil
}
if line[0] == '[' {
pc.Set(linkBottom, pc.LastDelimiter())
pushLinkBottom(pc)
return processLinkLabelOpen(block, segment.Start, false, pc)
}
@ -136,17 +143,22 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
}
last := tlist.(*linkLabelState).Last
if last == nil {
_ = popLinkBottom(pc)
return nil
}
block.Advance(1)
removeLinkLabelState(pc, last)
if s.containsLink(last) { // a link in a link text is not allowed
// CommonMark spec says:
// > A link label can have at most 999 characters inside the square brackets.
if linkLabelStateLength(tlist.(*linkLabelState)) > 998 {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil
}
labelValue := block.Value(text.NewSegment(last.Segment.Start+1, segment.Start))
if util.IsBlank(labelValue) && !last.IsImage {
if !last.IsImage && s.containsLink(last) { // a link in a link text is not allowed
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil
}
@ -160,6 +172,7 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
link, hasValue = s.parseReferenceLink(parent, last, block, pc)
if link == nil && hasValue {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil
}
}
@ -169,9 +182,18 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
block.SetPosition(l, pos)
ssegment := text.NewSegment(last.Segment.Stop, segment.Start)
maybeReference := block.Value(ssegment)
ref, ok := pc.Root().Reference(util.ToLinkReference(maybeReference))
// CommonMark spec says:
// > A link label can have at most 999 characters inside the square brackets.
if len(maybeReference) > 999 {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil
}
ref, ok := pc.Reference(util.ToLinkReference(maybeReference))
if !ok {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil
}
link = ast.NewLink()
@ -187,15 +209,17 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
return link
}
func (s *linkParser) containsLink(last *linkLabelState) bool {
if last.IsImage {
func (s *linkParser) containsLink(n ast.Node) bool {
if n == nil {
return false
}
var c ast.Node
for c = last; c != nil; c = c.NextSibling() {
for c := n; c != nil; c = c.NextSibling() {
if _, ok := c.(*ast.Link); ok {
return true
}
if s.containsLink(c.FirstChild()) {
return true
}
}
return false
}
@ -212,11 +236,7 @@ func processLinkLabelOpen(block text.Reader, pos int, isImage bool, pc Context)
}
func (s *linkParser) processLinkLabel(parent ast.Node, link *ast.Link, last *linkLabelState, pc Context) {
var bottom ast.Node
if v := pc.Get(linkBottom); v != nil {
bottom = v.(ast.Node)
}
pc.Set(linkBottom, nil)
bottom := popLinkBottom(pc)
ProcessDelimiters(bottom, pc)
for c := last.NextSibling(); c != nil; {
next := c.NextSibling()
@ -226,24 +246,42 @@ func (s *linkParser) processLinkLabel(parent ast.Node, link *ast.Link, last *lin
}
}
func (s *linkParser) parseReferenceLink(parent ast.Node, last *linkLabelState, block text.Reader, pc Context) (*ast.Link, bool) {
var linkFindClosureOptions text.FindClosureOptions = text.FindClosureOptions{
Nesting: false,
Newline: true,
Advance: true,
}
func (s *linkParser) parseReferenceLink(parent ast.Node, last *linkLabelState,
block text.Reader, pc Context) (*ast.Link, bool) {
_, orgpos := block.Position()
block.Advance(1) // skip '['
line, segment := block.PeekLine()
endIndex := util.FindClosure(line, '[', ']', false, true)
if endIndex < 0 {
segments, found := block.FindClosure('[', ']', linkFindClosureOptions)
if !found {
return nil, false
}
block.Advance(endIndex + 1)
ssegment := segment.WithStop(segment.Start + endIndex)
maybeReference := block.Value(ssegment)
var maybeReference []byte
if segments.Len() == 1 { // avoid allocate a new byte slice
maybeReference = block.Value(segments.At(0))
} else {
maybeReference = []byte{}
for i := 0; i < segments.Len(); i++ {
s := segments.At(i)
maybeReference = append(maybeReference, block.Value(s)...)
}
}
if util.IsBlank(maybeReference) { // collapsed reference link
ssegment = text.NewSegment(last.Segment.Stop, orgpos.Start-1)
maybeReference = block.Value(ssegment)
s := text.NewSegment(last.Segment.Stop, orgpos.Start-1)
maybeReference = block.Value(s)
}
// CommonMark spec says:
// > A link label can have at most 999 characters inside the square brackets.
if len(maybeReference) > 999 {
return nil, true
}
ref, ok := pc.Root().Reference(util.ToLinkReference(maybeReference))
ref, ok := pc.Reference(util.ToLinkReference(maybeReference))
if !ok {
return nil, true
}
@ -295,20 +333,17 @@ func (s *linkParser) parseLink(parent ast.Node, last *linkLabelState, block text
func parseLinkDestination(block text.Reader) ([]byte, bool) {
block.SkipSpaces()
line, _ := block.PeekLine()
buf := []byte{}
if block.Peek() == '<' {
i := 1
for i < len(line) {
c := line[i]
if c == '\\' && i < len(line)-1 && util.IsPunct(line[i+1]) {
buf = append(buf, '\\', line[i+1])
i += 2
continue
} else if c == '>' {
block.Advance(i + 1)
return line[1:i], true
}
buf = append(buf, c)
i++
}
return nil, false
@ -318,7 +353,6 @@ func parseLinkDestination(block text.Reader) ([]byte, bool) {
for i < len(line) {
c := line[i]
if c == '\\' && i < len(line)-1 && util.IsPunct(line[i+1]) {
buf = append(buf, '\\', line[i+1])
i += 2
continue
} else if c == '(' {
@ -331,7 +365,6 @@ func parseLinkDestination(block text.Reader) ([]byte, bool) {
} else if util.IsSpace(c) {
break
}
buf = append(buf, c)
i++
}
block.Advance(i)
@ -348,17 +381,61 @@ func parseLinkTitle(block text.Reader) ([]byte, bool) {
if opener == '(' {
closer = ')'
}
line, _ := block.PeekLine()
pos := util.FindClosure(line[1:], opener, closer, false, true)
if pos < 0 {
return nil, false
block.Advance(1)
segments, found := block.FindClosure(opener, closer, linkFindClosureOptions)
if found {
if segments.Len() == 1 {
return block.Value(segments.At(0)), true
}
var title []byte
for i := 0; i < segments.Len(); i++ {
s := segments.At(i)
title = append(title, block.Value(s)...)
}
return title, true
}
pos += 2 // opener + closer
block.Advance(pos)
return line[1 : pos-1], true
return nil, false
}
func pushLinkBottom(pc Context) {
bottoms := pc.Get(linkBottom)
b := pc.LastDelimiter()
if bottoms == nil {
pc.Set(linkBottom, b)
return
}
if s, ok := bottoms.([]ast.Node); ok {
pc.Set(linkBottom, append(s, b))
return
}
pc.Set(linkBottom, []ast.Node{bottoms.(ast.Node), b})
}
func popLinkBottom(pc Context) ast.Node {
bottoms := pc.Get(linkBottom)
if bottoms == nil {
return nil
}
if v, ok := bottoms.(ast.Node); ok {
pc.Set(linkBottom, nil)
return v
}
s := bottoms.([]ast.Node)
v := s[len(s)-1]
n := s[0 : len(s)-1]
switch len(n) {
case 0:
pc.Set(linkBottom, nil)
case 1:
pc.Set(linkBottom, n[0])
default:
pc.Set(linkBottom, s[0:len(s)-1])
}
return v
}
func (s *linkParser) CloseBlock(parent ast.Node, block text.Reader, pc Context) {
pc.Set(linkBottom, nil)
tlist := pc.Get(linkLabelStateKey)
if tlist == nil {
return

View file

@ -52,7 +52,7 @@ func (p *linkReferenceParagraphTransformer) Transform(node *ast.Paragraph, reade
func parseLinkReferenceDefinition(block text.Reader, pc Context) (int, int) {
block.SkipSpaces()
line, segment := block.PeekLine()
line, _ := block.PeekLine()
if line == nil {
return -1, -1
}
@ -67,39 +67,33 @@ func parseLinkReferenceDefinition(block text.Reader, pc Context) (int, int) {
if line[pos] != '[' {
return -1, -1
}
open := segment.Start + pos + 1
closes := -1
block.Advance(pos + 1)
for {
line, segment = block.PeekLine()
if line == nil {
return -1, -1
}
closure := util.FindClosure(line, '[', ']', false, false)
if closure > -1 {
closes = segment.Start + closure
next := closure + 1
if next >= len(line) || line[next] != ':' {
return -1, -1
}
block.Advance(next + 1)
break
}
block.AdvanceLine()
}
if closes < 0 {
segments, found := block.FindClosure('[', ']', linkFindClosureOptions)
if !found {
return -1, -1
}
label := block.Value(text.NewSegment(open, closes))
var label []byte
if segments.Len() == 1 {
label = block.Value(segments.At(0))
} else {
for i := 0; i < segments.Len(); i++ {
s := segments.At(i)
label = append(label, block.Value(s)...)
}
}
if util.IsBlank(label) {
return -1, -1
}
if block.Peek() != ':' {
return -1, -1
}
block.Advance(1)
block.SkipSpaces()
destination, ok := parseLinkDestination(block)
if !ok {
return -1, -1
}
line, segment = block.PeekLine()
line, _ = block.PeekLine()
isNewLine := line == nil || util.IsBlank(line)
endLine, _ := block.Position()
@ -117,45 +111,40 @@ func parseLinkReferenceDefinition(block text.Reader, pc Context) (int, int) {
return -1, -1
}
block.Advance(1)
open = -1
closes = -1
closer := opener
if opener == '(' {
closer = ')'
}
for {
line, segment = block.PeekLine()
if line == nil {
segments, found = block.FindClosure(opener, closer, linkFindClosureOptions)
if !found {
if !isNewLine {
return -1, -1
}
if open < 0 {
open = segment.Start
}
closure := util.FindClosure(line, opener, closer, false, true)
if closure > -1 {
closes = segment.Start + closure
block.Advance(closure + 1)
break
}
ref := NewReference(label, destination, nil)
pc.AddReference(ref)
block.AdvanceLine()
return startLine, endLine + 1
}
if closes < 0 {
return -1, -1
var title []byte
if segments.Len() == 1 {
title = block.Value(segments.At(0))
} else {
for i := 0; i < segments.Len(); i++ {
s := segments.At(i)
title = append(title, block.Value(s)...)
}
}
line, segment = block.PeekLine()
line, _ = block.PeekLine()
if line != nil && !util.IsBlank(line) {
if !isNewLine {
return -1, -1
}
title := block.Value(text.NewSegment(open, closes))
ref := NewReference(label, destination, title)
pc.AddReference(ref)
return startLine, endLine
}
title := block.Value(text.NewSegment(open, closes))
endLine, _ = block.Position()
ref := NewReference(label, destination, title)
pc.AddReference(ref)

View file

@ -1,10 +1,11 @@
package parser
import (
"strconv"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"strconv"
)
type listItemType int
@ -15,9 +16,13 @@ const (
orderedList
)
var skipListParserKey = NewContextKey()
var emptyListItemWithBlankLines = NewContextKey()
var listItemFlagValue interface{} = true
// Same as
// `^(([ ]*)([\-\*\+]))(\s+.*)?\n?$`.FindSubmatchIndex or
// `^(([ ]*)(\d{1,9}[\.\)]))(\s+.*)?\n?$`.FindSubmatchIndex
// `^(([ ]*)(\d{1,9}[\.\)]))(\s+.*)?\n?$`.FindSubmatchIndex.
func parseListItem(line []byte) ([6]int, listItemType) {
i := 0
l := len(line)
@ -84,7 +89,7 @@ func matchesListItem(source []byte, strict bool) ([6]int, listItemType) {
}
func calcListOffset(source []byte, match [6]int) int {
offset := 0
var offset int
if match[4] < 0 || util.IsBlank(source[match[4]:]) { // list item starts with a blank line
offset = 1
} else {
@ -122,8 +127,8 @@ func (b *listParser) Trigger() []byte {
func (b *listParser) Open(parent ast.Node, reader text.Reader, pc Context) (ast.Node, State) {
last := pc.LastOpenedBlock().Node
if _, lok := last.(*ast.List); lok || pc.Get(skipListParser) != nil {
pc.Set(skipListParser, nil)
if _, lok := last.(*ast.List); lok || pc.Get(skipListParserKey) != nil {
pc.Set(skipListParserKey, nil)
return nil, NoChildren
}
line, _ := reader.PeekLine()
@ -143,7 +148,7 @@ func (b *listParser) Open(parent ast.Node, reader text.Reader, pc Context) (ast.
return nil, NoChildren
}
//an empty list item cannot interrupt a paragraph:
if match[5]-match[4] == 1 {
if match[4] < 0 || util.IsBlank(line[match[4]:match[5]]) {
return nil, NoChildren
}
}
@ -153,6 +158,7 @@ func (b *listParser) Open(parent ast.Node, reader text.Reader, pc Context) (ast.
if start > -1 {
node.Start = start
}
pc.Set(emptyListItemWithBlankLines, nil)
return node, HasChildren
}
@ -160,26 +166,11 @@ func (b *listParser) Continue(node ast.Node, reader text.Reader, pc Context) Sta
list := node.(*ast.List)
line, _ := reader.PeekLine()
if util.IsBlank(line) {
// A list item can begin with at most one blank line
if node.ChildCount() == 1 && node.LastChild().ChildCount() == 0 {
return Close
if node.LastChild().ChildCount() == 0 {
pc.Set(emptyListItemWithBlankLines, listItemFlagValue)
}
return Continue | HasChildren
}
// Thematic Breaks take precedence over lists
if isThematicBreak(line) {
isHeading := false
last := pc.LastOpenedBlock().Node
if ast.IsParagraph(last) {
c, ok := matchesSetextHeadingBar(line)
if ok && c == '-' {
isHeading = true
}
}
if !isHeading {
return Close
}
}
// "offset" means a width that bar indicates.
// - aaaaaaaa
@ -189,10 +180,23 @@ func (b *listParser) Continue(node ast.Node, reader text.Reader, pc Context) Sta
// - a
// - b <--- current line
// it maybe a new child of the list.
//
// Empty list items can have multiple blanklines
//
// - <--- 1st item is an empty thus "offset" is unknown
//
//
// - <--- current line
//
// -> 1 list with 2 blank items
//
// So if the last item is an empty, it maybe a new child of the list.
//
offset := lastOffset(node)
indent, _ := util.IndentWidth(line, 0)
lastIsEmpty := node.LastChild().ChildCount() == 0
indent, _ := util.IndentWidth(line, reader.LineOffset())
if indent < offset {
if indent < offset || lastIsEmpty {
if indent < 4 {
match, typ := matchesListItem(line, false) // may have a leading spaces more than 3
if typ != notList && match[1]-offset < 4 {
@ -200,9 +204,41 @@ func (b *listParser) Continue(node ast.Node, reader text.Reader, pc Context) Sta
if !list.CanContinue(marker, typ == orderedList) {
return Close
}
// Thematic Breaks take precedence over lists
if isThematicBreak(line[match[3]-1:], 0) {
isHeading := false
last := pc.LastOpenedBlock().Node
if ast.IsParagraph(last) {
c, ok := matchesSetextHeadingBar(line[match[3]-1:])
if ok && c == '-' {
isHeading = true
}
}
if !isHeading {
return Close
}
}
return Continue | HasChildren
}
}
if !lastIsEmpty {
return Close
}
}
if lastIsEmpty && indent < offset {
return Close
}
// Non empty items can not exist next to an empty list item
// with blank lines. So we need to close the current list
//
// -
//
// foo
//
// -> 1 list with 1 blank items and 1 paragraph
if pc.Get(emptyListItemWithBlankLines) != nil {
return Close
}
return Continue | HasChildren
@ -214,14 +250,14 @@ func (b *listParser) Close(node ast.Node, reader text.Reader, pc Context) {
for c := node.FirstChild(); c != nil && list.IsTight; c = c.NextSibling() {
if c.FirstChild() != nil && c.FirstChild() != c.LastChild() {
for c1 := c.FirstChild().NextSibling(); c1 != nil; c1 = c1.NextSibling() {
if bl, ok := c1.(ast.Node); ok && bl.HasBlankPreviousLines() {
if c1.HasBlankPreviousLines() {
list.IsTight = false
break
}
}
}
if c != node.FirstChild() {
if bl, ok := c.(ast.Node); ok && bl.HasBlankPreviousLines() {
if c.HasBlankPreviousLines() {
list.IsTight = false
}
}
@ -229,8 +265,9 @@ func (b *listParser) Close(node ast.Node, reader text.Reader, pc Context) {
if list.IsTight {
for child := node.FirstChild(); child != nil; child = child.NextSibling() {
for gc := child.FirstChild(); gc != nil; gc = gc.NextSibling() {
for gc := child.FirstChild(); gc != nil; {
paragraph, ok := gc.(*ast.Paragraph)
gc = gc.NextSibling()
if ok {
textBlock := ast.NewTextBlock()
textBlock.SetLines(paragraph.Lines())

View file

@ -17,9 +17,6 @@ func NewListItemParser() BlockParser {
return defaultListItemParser
}
var skipListParser = NewContextKey()
var skipListParserValue interface{} = true
func (b *listItemParser) Trigger() []byte {
return []byte{'-', '+', '*', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9'}
}
@ -38,9 +35,12 @@ func (b *listItemParser) Open(parent ast.Node, reader text.Reader, pc Context) (
if match[1]-offset > 3 {
return nil, NoChildren
}
pc.Set(emptyListItemWithBlankLines, nil)
itemOffset := calcListOffset(line, match)
node := ast.NewListItem(match[3] + itemOffset)
if match[4] < 0 || match[5]-match[4] == 1 {
if match[4] < 0 || util.IsBlank(line[match[4]:match[5]]) {
return node, NoChildren
}
@ -53,18 +53,23 @@ func (b *listItemParser) Open(parent ast.Node, reader text.Reader, pc Context) (
func (b *listItemParser) Continue(node ast.Node, reader text.Reader, pc Context) State {
line, _ := reader.PeekLine()
if util.IsBlank(line) {
reader.Advance(len(line) - 1)
return Continue | HasChildren
}
indent, _ := util.IndentWidth(line, reader.LineOffset())
offset := lastOffset(node.Parent())
if indent < offset && indent < 4 {
isEmpty := node.ChildCount() == 0 && pc.Get(emptyListItemWithBlankLines) != nil
indent, _ := util.IndentWidth(line, reader.LineOffset())
if (isEmpty || indent < offset) && indent < 4 {
_, typ := matchesListItem(line, true)
// new list item found
if typ != notList {
pc.Set(skipListParser, skipListParserValue)
pc.Set(skipListParserKey, listItemFlagValue)
return Close
}
if !isEmpty {
return Close
}
return Close
}
pos, padding := util.IndentPosition(line, reader.LineOffset(), offset)
reader.AdvanceAndSetPadding(pos, padding)

View file

@ -3,6 +3,7 @@ package parser
import (
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
)
type paragraphParser struct {
@ -33,9 +34,8 @@ func (b *paragraphParser) Open(parent ast.Node, reader text.Reader, pc Context)
}
func (b *paragraphParser) Continue(node ast.Node, reader text.Reader, pc Context) State {
_, segment := reader.PeekLine()
segment = segment.TrimLeftSpace(reader.Source())
if segment.IsEmpty() {
line, segment := reader.PeekLine()
if util.IsBlank(line) {
return Close
}
node.Lines().Append(segment)
@ -44,13 +44,14 @@ func (b *paragraphParser) Continue(node ast.Node, reader text.Reader, pc Context
}
func (b *paragraphParser) Close(node ast.Node, reader text.Reader, pc Context) {
parent := node.Parent()
if parent == nil {
// paragraph has been transformed
return
}
lines := node.Lines()
if lines.Len() != 0 {
// trim leading spaces
for i := 0; i < lines.Len(); i++ {
l := lines.At(i)
lines.Set(i, l.TrimLeftSpace(reader.Source()))
}
// trim trailing spaces
length := lines.Len()
lastLine := node.Lines().At(length - 1)

View file

@ -56,7 +56,7 @@ func (r *reference) String() string {
// An IDs interface is a collection of the element ids.
type IDs interface {
// Generate generates a new element id.
Generate(value, prefix []byte) []byte
Generate(value []byte, kind ast.NodeKind) []byte
// Put puts a given element id to the used ids table.
Put(value []byte)
@ -72,7 +72,7 @@ func newIDs() IDs {
}
}
func (s *ids) Generate(value, prefix []byte) []byte {
func (s *ids) Generate(value []byte, kind ast.NodeKind) []byte {
value = util.TrimLeftSpace(value)
value = util.TrimRightSpace(value)
result := []byte{}
@ -88,13 +88,13 @@ func (s *ids) Generate(value, prefix []byte) []byte {
v += 'a' - 'A'
}
result = append(result, v)
} else if util.IsSpace(v) {
} else if util.IsSpace(v) || v == '-' || v == '_' {
result = append(result, '-')
}
}
if len(result) == 0 {
if prefix != nil {
result = append(make([]byte, 0, len(prefix)), prefix...)
if kind == ast.KindHeading {
result = []byte("heading")
} else {
result = []byte("id")
}
@ -104,7 +104,7 @@ func (s *ids) Generate(value, prefix []byte) []byte {
return result
}
for i := 1; ; i++ {
newResult := fmt.Sprintf("%s%d", result, i)
newResult := fmt.Sprintf("%s-%d", result, i)
if _, ok := s.values[newResult]; !ok {
s.values[newResult] = true
return []byte(newResult)
@ -138,6 +138,9 @@ type Context interface {
// Get returns a value associated with the given key.
Get(ContextKey) interface{}
// ComputeIfAbsent computes a value if a value associated with the given key is absent and returns the value.
ComputeIfAbsent(ContextKey, func() interface{}) interface{}
// Set sets the given value to the context.
Set(ContextKey, interface{})
@ -197,8 +200,23 @@ type Context interface {
// LastOpenedBlock returns a last node that is currently in parsing.
LastOpenedBlock() Block
// Root returns a context shared accross goroutines.
Root() Context
// IsInLinkLabel returns true if current position seems to be in link label.
IsInLinkLabel() bool
}
// A ContextConfig struct is a data structure that holds configuration of the Context.
type ContextConfig struct {
IDs IDs
}
// An ContextOption is a functional option type for the Context.
type ContextOption func(*ContextConfig)
// WithIDs is a functional option for the Context.
func WithIDs(ids IDs) ContextOption {
return func(c *ContextConfig) {
c.IDs = ids
}
}
type parseContext struct {
@ -210,21 +228,26 @@ type parseContext struct {
delimiters *Delimiter
lastDelimiter *Delimiter
openedBlocks []Block
root Context
}
// NewContext returns a new Context.
func NewContext() Context {
func NewContext(options ...ContextOption) Context {
cfg := &ContextConfig{
IDs: newIDs(),
}
for _, option := range options {
option(cfg)
}
return &parseContext{
store: make([]interface{}, ContextKeyMax+1),
refs: map[string]Reference{},
ids: newIDs(),
ids: cfg.IDs,
blockOffset: -1,
blockIndent: -1,
delimiters: nil,
lastDelimiter: nil,
openedBlocks: []Block{},
root: nil,
}
}
@ -232,6 +255,15 @@ func (p *parseContext) Get(key ContextKey) interface{} {
return p.store[key]
}
func (p *parseContext) ComputeIfAbsent(key ContextKey, f func() interface{}) interface{} {
v := p.store[key]
if v == nil {
v = f()
p.store[key] = v
}
return v
}
func (p *parseContext) Set(key ContextKey, value interface{}) {
p.store[key] = value
}
@ -361,138 +393,9 @@ func (p *parseContext) LastOpenedBlock() Block {
return Block{}
}
func (p *parseContext) Root() Context {
if p.root == nil {
return p
}
return p.root
}
type concurrentParseContext struct {
delegate Context
m sync.RWMutex
root Context
}
func NewConcurrentContext(delegate Context) Context {
return &concurrentParseContext{
delegate: delegate,
root: nil,
}
}
func (p *concurrentParseContext) Get(key ContextKey) interface{} {
p.m.RLock()
defer p.m.RUnlock()
ret := p.delegate.Get(key)
return ret
}
func (p *concurrentParseContext) Set(key ContextKey, value interface{}) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.Set(key, value)
}
func (p *concurrentParseContext) IDs() IDs {
return p.delegate.IDs()
}
func (p *concurrentParseContext) BlockOffset() int {
return p.delegate.BlockOffset()
}
func (p *concurrentParseContext) SetBlockOffset(v int) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.SetBlockOffset(v)
}
func (p *concurrentParseContext) BlockIndent() int {
return p.delegate.BlockIndent()
}
func (p *concurrentParseContext) SetBlockIndent(v int) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.SetBlockIndent(v)
}
func (p *concurrentParseContext) LastDelimiter() *Delimiter {
return p.delegate.LastDelimiter()
}
func (p *concurrentParseContext) FirstDelimiter() *Delimiter {
return p.delegate.FirstDelimiter()
}
func (p *concurrentParseContext) PushDelimiter(d *Delimiter) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.PushDelimiter(d)
}
func (p *concurrentParseContext) RemoveDelimiter(d *Delimiter) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.RemoveDelimiter(d)
}
func (p *concurrentParseContext) ClearDelimiters(bottom ast.Node) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.ClearDelimiters(bottom)
}
func (p *concurrentParseContext) AddReference(ref Reference) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.AddReference(ref)
}
func (p *concurrentParseContext) Reference(label string) (Reference, bool) {
p.m.RLock()
defer p.m.RUnlock()
v, ok := p.delegate.Reference(label)
return v, ok
}
func (p *concurrentParseContext) References() []Reference {
p.m.RLock()
defer p.m.RUnlock()
ret := p.delegate.References()
return ret
}
func (p *concurrentParseContext) String() string {
p.m.RLock()
defer p.m.RUnlock()
ret := p.delegate.String()
return ret
}
func (p *concurrentParseContext) OpenedBlocks() []Block {
return p.delegate.OpenedBlocks()
}
func (p *concurrentParseContext) SetOpenedBlocks(v []Block) {
p.m.Lock()
defer p.m.Unlock()
p.delegate.SetOpenedBlocks(v)
}
func (p *concurrentParseContext) LastOpenedBlock() Block {
p.m.RLock()
defer p.m.RUnlock()
ret := p.delegate.LastOpenedBlock()
return ret
}
func (p *concurrentParseContext) Root() Context {
if p.root == nil {
return p
}
return p.root
func (p *parseContext) IsInLinkLabel() bool {
tlist := p.Get(linkLabelStateKey)
return tlist != nil
}
// State represents parser's state.
@ -500,7 +403,8 @@ func (p *concurrentParseContext) Root() Context {
type State int
const (
none State = 1 << iota
// None is a default value of the [State].
None State = 1 << iota
// Continue indicates parser can continue parsing.
Continue
@ -527,6 +431,7 @@ type Config struct {
InlineParsers util.PrioritizedSlice /*<InlineParser>*/
ParagraphTransformers util.PrioritizedSlice /*<ParagraphTransformer>*/
ASTTransformers util.PrioritizedSlice /*<ASTTransformer>*/
EscapedSpace bool
}
// NewConfig returns a new Config.
@ -568,7 +473,7 @@ type Parser interface {
// Parse parses the given Markdown text into AST nodes.
Parse(reader text.Reader, opts ...ParseOption) ast.Node
// AddOption adds the given option to thie parser.
// AddOption adds the given option to this parser.
AddOptions(...Option)
}
@ -614,7 +519,7 @@ type BlockParser interface {
// Close will be called when the parser returns Close.
Close(node ast.Node, reader text.Reader, pc Context)
// CanInterruptParagraph returns true if the parser can interrupt pargraphs,
// CanInterruptParagraph returns true if the parser can interrupt paragraphs,
// otherwise false.
CanInterruptParagraph() bool
@ -663,16 +568,16 @@ type ASTTransformer interface {
// DefaultBlockParsers returns a new list of default BlockParsers.
// Priorities of default BlockParsers are:
//
// SetextHeadingParser, 100
// ThematicBreakParser, 200
// ListParser, 300
// ListItemParser, 400
// CodeBlockParser, 500
// ATXHeadingParser, 600
// FencedCodeBlockParser, 700
// BlockquoteParser, 800
// HTMLBlockParser, 900
// ParagraphParser, 1000
// SetextHeadingParser, 100
// ThematicBreakParser, 200
// ListParser, 300
// ListItemParser, 400
// CodeBlockParser, 500
// ATXHeadingParser, 600
// FencedCodeBlockParser, 700
// BlockquoteParser, 800
// HTMLBlockParser, 900
// ParagraphParser, 1000
func DefaultBlockParsers() []util.PrioritizedValue {
return []util.PrioritizedValue{
util.Prioritized(NewSetextHeadingParser(), 100),
@ -691,11 +596,11 @@ func DefaultBlockParsers() []util.PrioritizedValue {
// DefaultInlineParsers returns a new list of default InlineParsers.
// Priorities of default InlineParsers are:
//
// CodeSpanParser, 100
// LinkParser, 200
// AutoLinkParser, 300
// RawHTMLParser, 400
// EmphasisParser, 500
// CodeSpanParser, 100
// LinkParser, 200
// AutoLinkParser, 300
// RawHTMLParser, 400
// EmphasisParser, 500
func DefaultInlineParsers() []util.PrioritizedValue {
return []util.PrioritizedValue{
util.Prioritized(NewCodeSpanParser(), 100),
@ -709,7 +614,7 @@ func DefaultInlineParsers() []util.PrioritizedValue {
// DefaultParagraphTransformers returns a new list of default ParagraphTransformers.
// Priorities of default ParagraphTransformers are:
//
// LinkReferenceParagraphTransformer, 100
// LinkReferenceParagraphTransformer, 100
func DefaultParagraphTransformers() []util.PrioritizedValue {
return []util.PrioritizedValue{
util.Prioritized(LinkReferenceParagraphTransformer, 100),
@ -732,6 +637,7 @@ type parser struct {
closeBlockers []CloseBlocker
paragraphTransformers []ParagraphTransformer
astTransformers []ASTTransformer
escapedSpace bool
config *Config
initSync sync.Once
}
@ -792,6 +698,18 @@ func WithASTTransformers(ps ...util.PrioritizedValue) Option {
return &withASTTransformers{ps}
}
type withEscapedSpace struct {
}
func (o *withEscapedSpace) SetParserOption(c *Config) {
c.EscapedSpace = true
}
// WithEscapedSpace is a functional option indicates that a '\' escaped half-space(0x20) should not trigger parsers.
func WithEscapedSpace() Option {
return &withEscapedSpace{}
}
type withOption struct {
name OptionName
value interface{}
@ -906,7 +824,6 @@ func (p *parser) addASTTransformer(v util.PrioritizedValue, options map[OptionNa
// A ParseConfig struct is a data structure that holds configuration of the Parser.Parse.
type ParseConfig struct {
Context Context
Workers int
}
// A ParseOption is a functional option type for the Parser.Parse.
@ -920,15 +837,6 @@ func WithContext(context Context) ParseOption {
}
}
// WithWorkers is a functional option that allow you to set
// number of inline parsing workers(goroutines).
// If num is 0, inline parsing will never be multithreaded.
func WithWorkers(num int) ParseOption {
return func(c *ParseConfig) {
c.Workers = num
}
}
func (p *parser) Parse(reader text.Reader, opts ...ParseOption) ast.Node {
p.initSync.Do(func() {
p.config.BlockParsers.Sort()
@ -953,6 +861,7 @@ func (p *parser) Parse(reader text.Reader, opts ...ParseOption) ast.Node {
for _, v := range p.config.ASTTransformers {
p.addASTTransformer(v, p.config.Options)
}
p.escapedSpace = p.config.EscapedSpace
p.config = nil
})
c := &ParseConfig{}
@ -966,49 +875,15 @@ func (p *parser) Parse(reader text.Reader, opts ...ParseOption) ast.Node {
root := ast.NewDocument()
p.parseBlocks(root, reader, pc)
if c.Workers < 2 {
blockReader := text.NewBlockReader(reader.Source(), nil)
p.walkBlock(root, func(node ast.Node) {
p.parseBlock(blockReader, node, pc)
})
} else {
nodes := make([]ast.Node, 0, 100)
p.walkBlock(root, func(node ast.Node) {
nodes = append(nodes, node)
})
max := (len(nodes) / c.Workers) - 1
if max < 0 {
blockReader := text.NewBlockReader(reader.Source(), nil)
p.walkBlock(root, func(node ast.Node) {
p.parseBlock(blockReader, node, pc)
})
} else {
rootContext := NewConcurrentContext(pc)
var wg sync.WaitGroup
for i := 0; i <= max; i++ {
from := i * c.Workers
to := from + c.Workers
if i == max {
to = len(nodes)
}
wg.Add(1)
go func(wg *sync.WaitGroup) {
blockReader := text.NewBlockReader(reader.Source(), nil)
pc := NewContext()
pc.(*parseContext).root = rootContext
for _, n := range nodes[from:to] {
p.parseBlock(blockReader, n, pc)
}
wg.Done()
}(&wg)
}
wg.Wait()
}
}
blockReader := text.NewBlockReader(reader.Source(), nil)
p.walkBlock(root, func(node ast.Node) {
p.parseBlock(blockReader, node, pc)
})
for _, at := range p.astTransformers {
at.Transform(root, reader, pc)
}
//root.Dump(reader.Source(), 0)
// root.Dump(reader.Source(), 0)
return root
}
@ -1026,11 +901,13 @@ func (p *parser) closeBlocks(from, to int, reader text.Reader, pc Context) {
blocks := pc.OpenedBlocks()
for i := from; i >= to; i-- {
node := blocks[i].Node
blocks[i].Parser.Close(blocks[i].Node, reader, pc)
paragraph, ok := node.(*ast.Paragraph)
if ok && node.Parent() != nil {
p.transformParagraph(paragraph, reader, pc)
}
if node.Parent() != nil { // closes only if node has not been transformed
blocks[i].Parser.Close(blocks[i].Node, reader, pc)
}
}
if from == len(blocks)-1 {
blocks = blocks[0:to]
@ -1058,7 +935,7 @@ func (p *parser) openBlocks(parent ast.Node, blankLine bool, reader text.Reader,
retry:
var bps []BlockParser
line, _ := reader.PeekLine()
w, pos := util.IndentWidth(line, 0)
w, pos := util.IndentWidth(line, reader.LineOffset())
if w >= len(line) {
pc.SetBlockOffset(-1)
pc.SetBlockIndent(-1)
@ -1087,7 +964,7 @@ retry:
if w > 3 && !bp.CanAcceptIndentedLine() {
continue
}
lastBlock := pc.LastOpenedBlock()
lastBlock = pc.LastOpenedBlock()
last := lastBlock.Node
node, state := bp.Open(parent, reader, pc)
if node != nil {
@ -1153,8 +1030,9 @@ type lineStat struct {
}
func isBlankLine(lineNum, level int, stats []lineStat) bool {
ret := false
ret := true
for i := len(stats) - 1 - level; i >= 0; i-- {
ret = false
s := stats[i]
if s.lineNum == lineNum {
if s.level < level && s.isBlank {
@ -1173,7 +1051,7 @@ func isBlankLine(lineNum, level int, stats []lineStat) bool {
func (p *parser) parseBlocks(parent ast.Node, reader text.Reader, pc Context) {
pc.SetOpenedBlocks([]Block{})
blankLines := make([]lineStat, 0, 128)
isBlank := false
var isBlank bool
for { // process blocks separated by blank lines
_, lines, ok := reader.SkipBlankLines()
if !ok {
@ -1256,6 +1134,12 @@ func (p *parser) walkBlock(block ast.Node, cb func(node ast.Node)) {
cb(block)
}
const (
lineBreakHard uint8 = 1 << iota
lineBreakSoft
lineBreakVisible
)
func (p *parser) parseBlock(block text.BlockReader, parent ast.Node, pc Context) {
if parent.IsRaw() {
return
@ -1270,18 +1154,42 @@ func (p *parser) parseBlock(block text.BlockReader, parent ast.Node, pc Context)
break
}
lineLength := len(line)
var lineBreakFlags uint8
hasNewLine := line[lineLength-1] == '\n'
if ((lineLength >= 3 && line[lineLength-2] == '\\' &&
line[lineLength-3] != '\\') || (lineLength == 2 && line[lineLength-2] == '\\')) && hasNewLine { // ends with \\n
lineLength -= 2
lineBreakFlags |= lineBreakHard | lineBreakVisible
} else if ((lineLength >= 4 && line[lineLength-3] == '\\' && line[lineLength-2] == '\r' &&
line[lineLength-4] != '\\') || (lineLength == 3 && line[lineLength-3] == '\\' && line[lineLength-2] == '\r')) &&
hasNewLine { // ends with \\r\n
lineLength -= 3
lineBreakFlags |= lineBreakHard | lineBreakVisible
} else if lineLength >= 3 && line[lineLength-3] == ' ' && line[lineLength-2] == ' ' &&
hasNewLine { // ends with [space][space]\n
lineLength -= 3
lineBreakFlags |= lineBreakHard
} else if lineLength >= 4 && line[lineLength-4] == ' ' && line[lineLength-3] == ' ' &&
line[lineLength-2] == '\r' && hasNewLine { // ends with [space][space]\r\n
lineLength -= 4
lineBreakFlags |= lineBreakHard
} else if hasNewLine {
// If the line ends with a newline character, but it is not a hardlineBreak, then it is a softLinebreak
// If the line ends with a hardlineBreak, then it cannot end with a softLinebreak
// See https://spec.commonmark.org/0.30/#soft-line-breaks
lineBreakFlags |= lineBreakSoft
}
l, startPosition := block.Position()
n := 0
softLinebreak := false
for i := 0; i < lineLength; i++ {
c := line[i]
if c == '\n' {
softLinebreak = true
break
}
isSpace := util.IsSpace(c)
isSpace := util.IsSpace(c) && c != '\r' && c != '\n'
isPunct := util.IsPunct(c)
if (isPunct && !escaped) || isSpace || i == 0 {
if (isPunct && !escaped) || isSpace && !(escaped && p.escapedSpace) || i == 0 {
parserChar := c
if isSpace || (i == 0 && !isPunct) {
parserChar = ' '
@ -1333,18 +1241,14 @@ func (p *parser) parseBlock(block text.BlockReader, parent ast.Node, pc Context)
continue
}
diff := startPosition.Between(currentPosition)
stop := diff.Stop
hardlineBreak := false
if lineLength > 2 && line[lineLength-2] == '\\' && softLinebreak { // ends with \\n
stop--
hardlineBreak = true
} else if lineLength > 3 && line[lineLength-3] == ' ' && line[lineLength-2] == ' ' && softLinebreak { // ends with [space][space]\n
hardlineBreak = true
var text *ast.Text
if lineBreakFlags&(lineBreakHard|lineBreakVisible) == lineBreakHard|lineBreakVisible {
text = ast.NewTextSegment(diff)
} else {
text = ast.NewTextSegment(diff.TrimRightSpace(source))
}
rest := diff.WithStop(stop)
text := ast.NewTextSegment(rest.TrimRightSpace(source))
text.SetSoftLineBreak(softLinebreak)
text.SetHardLineBreak(hardlineBreak)
text.SetSoftLineBreak(lineBreakFlags&lineBreakSoft != 0)
text.SetHardLineBreak(lineBreakFlags&lineBreakHard != 0)
parent.AppendChild(parent, text)
block.AdvanceLine()
}

View file

@ -2,10 +2,11 @@ package parser
import (
"bytes"
"regexp"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util"
"regexp"
)
type rawHTMLParser struct {
@ -14,7 +15,7 @@ type rawHTMLParser struct {
var defaultRawHTMLParser = &rawHTMLParser{}
// NewRawHTMLParser return a new InlineParser that can parse
// inline htmls
// inline htmls.
func NewRawHTMLParser() InlineParser {
return defaultRawHTMLParser
}
@ -31,43 +32,92 @@ func (s *rawHTMLParser) Parse(parent ast.Node, block text.Reader, pc Context) as
if len(line) > 2 && line[1] == '/' && util.IsAlphaNumeric(line[2]) {
return s.parseMultiLineRegexp(closeTagRegexp, block, pc)
}
if bytes.HasPrefix(line, []byte("<!--")) {
return s.parseMultiLineRegexp(commentRegexp, block, pc)
if bytes.HasPrefix(line, openComment) {
return s.parseComment(block, pc)
}
if bytes.HasPrefix(line, []byte("<?")) {
return s.parseSingleLineRegexp(processingInstructionRegexp, block, pc)
if bytes.HasPrefix(line, openProcessingInstruction) {
return s.parseUntil(block, closeProcessingInstruction, pc)
}
if len(line) > 2 && line[1] == '!' && line[2] >= 'A' && line[2] <= 'Z' {
return s.parseSingleLineRegexp(declRegexp, block, pc)
return s.parseUntil(block, closeDecl, pc)
}
if bytes.HasPrefix(line, []byte("<![CDATA[")) {
return s.parseMultiLineRegexp(cdataRegexp, block, pc)
if bytes.HasPrefix(line, openCDATA) {
return s.parseUntil(block, closeCDATA, pc)
}
return nil
}
var tagnamePattern = `([A-Za-z][A-Za-z0-9-]*)`
var attributePattern = `(?:\s+[a-zA-Z_:][a-zA-Z0-9:._-]*(?:\s*=\s*(?:[^\"'=<>` + "`" + `\x00-\x20]+|'[^']*'|"[^"]*"))?)`
var openTagRegexp = regexp.MustCompile("^<" + tagnamePattern + attributePattern + `*\s*/?>`)
var closeTagRegexp = regexp.MustCompile("^</" + tagnamePattern + `\s*>`)
var commentRegexp = regexp.MustCompile(`^<!---->|<!--(?:-?[^>-])(?:-?[^-])*-->`)
var processingInstructionRegexp = regexp.MustCompile(`^(?:<\?).*?(?:\?>)`)
var declRegexp = regexp.MustCompile(`^<![A-Z]+\s+[^>]*>`)
var cdataRegexp = regexp.MustCompile(`<!\[CDATA\[[\s\S]*?\]\]>`)
var spaceOrOneNewline = `(?:[ \t]|(?:\r\n|\n){0,1})`
var attributePattern = `(?:[\r\n \t]+[a-zA-Z_:][a-zA-Z0-9:._-]*(?:[\r\n \t]*=[\r\n \t]*(?:[^\"'=<>` + "`" + `\x00-\x20]+|'[^']*'|"[^"]*"))?)` //nolint:golint,lll
var openTagRegexp = regexp.MustCompile("^<" + tagnamePattern + attributePattern + `*` + spaceOrOneNewline + `*/?>`)
var closeTagRegexp = regexp.MustCompile("^</" + tagnamePattern + spaceOrOneNewline + `*>`)
func (s *rawHTMLParser) parseSingleLineRegexp(reg *regexp.Regexp, block text.Reader, pc Context) ast.Node {
line, segment := block.PeekLine()
match := reg.FindSubmatchIndex(line)
if match == nil {
return nil
}
var openProcessingInstruction = []byte("<?")
var closeProcessingInstruction = []byte("?>")
var openCDATA = []byte("<![CDATA[")
var closeCDATA = []byte("]]>")
var closeDecl = []byte(">")
var emptyComment1 = []byte("<!-->")
var emptyComment2 = []byte("<!--->")
var openComment = []byte("<!--")
var closeComment = []byte("-->")
func (s *rawHTMLParser) parseComment(block text.Reader, pc Context) ast.Node {
savedLine, savedSegment := block.Position()
node := ast.NewRawHTML()
node.Segments.Append(segment.WithStop(segment.Start + match[1]))
block.Advance(match[1])
return node
line, segment := block.PeekLine()
if bytes.HasPrefix(line, emptyComment1) {
node.Segments.Append(segment.WithStop(segment.Start + len(emptyComment1)))
block.Advance(len(emptyComment1))
return node
}
if bytes.HasPrefix(line, emptyComment2) {
node.Segments.Append(segment.WithStop(segment.Start + len(emptyComment2)))
block.Advance(len(emptyComment2))
return node
}
offset := len(openComment)
line = line[offset:]
for {
index := bytes.Index(line, closeComment)
if index > -1 {
node.Segments.Append(segment.WithStop(segment.Start + offset + index + len(closeComment)))
block.Advance(offset + index + len(closeComment))
return node
}
offset = 0
node.Segments.Append(segment)
block.AdvanceLine()
line, segment = block.PeekLine()
if line == nil {
break
}
}
block.SetPosition(savedLine, savedSegment)
return nil
}
var dummyMatch = [][]byte{}
func (s *rawHTMLParser) parseUntil(block text.Reader, closer []byte, pc Context) ast.Node {
savedLine, savedSegment := block.Position()
node := ast.NewRawHTML()
for {
line, segment := block.PeekLine()
if line == nil {
break
}
index := bytes.Index(line, closer)
if index > -1 {
node.Segments.Append(segment.WithStop(segment.Start + index + len(closer)))
block.Advance(index + len(closer))
return node
}
node.Segments.Append(segment)
block.AdvanceLine()
}
block.SetPosition(savedLine, savedSegment)
return nil
}
func (s *rawHTMLParser) parseMultiLineRegexp(reg *regexp.Regexp, block text.Reader, pc Context) ast.Node {
sline, ssegment := block.Position()
@ -94,15 +144,10 @@ func (s *rawHTMLParser) parseMultiLineRegexp(reg *regexp.Regexp, block text.Read
if l == eline {
block.Advance(end - start)
break
} else {
block.AdvanceLine()
}
block.AdvanceLine()
}
return node
}
return nil
}
func (s *rawHTMLParser) CloseBlock(parent ast.Node, pc Context) {
// nothing to do
}

View file

@ -91,7 +91,7 @@ func (b *setextHeadingParser) Close(node ast.Node, reader text.Reader, pc Contex
para.Lines().Append(segment)
heading.Parent().InsertAfter(heading.Parent(), heading, para)
} else {
next.(ast.Node).Lines().Unshift(segment)
next.Lines().Unshift(segment)
}
heading.Parent().RemoveChild(heading.Parent(), heading)
} else {
@ -108,9 +108,11 @@ func (b *setextHeadingParser) Close(node ast.Node, reader text.Reader, pc Contex
}
if b.AutoHeadingID {
_, ok := node.AttributeString("id")
id, ok := node.AttributeString("id")
if !ok {
generateAutoHeadingID(heading, reader, pc)
} else {
pc.IDs().Put(id.([]byte))
}
}
}

View file

@ -11,14 +11,14 @@ type thematicBreakPraser struct {
var defaultThematicBreakPraser = &thematicBreakPraser{}
// NewThematicBreakPraser returns a new BlockParser that
// NewThematicBreakParser returns a new BlockParser that
// parses thematic breaks.
func NewThematicBreakParser() BlockParser {
return defaultThematicBreakPraser
}
func isThematicBreak(line []byte) bool {
w, pos := util.IndentWidth(line, 0)
func isThematicBreak(line []byte, offset int) bool {
w, pos := util.IndentWidth(line, offset)
if w > 3 {
return false
}
@ -51,7 +51,7 @@ func (b *thematicBreakPraser) Trigger() []byte {
func (b *thematicBreakPraser) Open(parent ast.Node, reader text.Reader, pc Context) (ast.Node, State) {
line, segment := reader.PeekLine()
if isThematicBreak(line) {
if isThematicBreak(line, reader.LineOffset()) {
reader.Advance(segment.Len() - 1)
return ast.NewThematicBreak(), NoChildren
}

View file

@ -1,9 +1,12 @@
// Package html implements renderer that outputs HTMLs.
package html
import (
"bytes"
"fmt"
"strconv"
"unicode"
"unicode/utf8"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/renderer"
@ -12,19 +15,21 @@ import (
// A Config struct has configurations for the HTML based renderers.
type Config struct {
Writer Writer
HardWraps bool
XHTML bool
Unsafe bool
Writer Writer
HardWraps bool
EastAsianLineBreaks EastAsianLineBreaks
XHTML bool
Unsafe bool
}
// NewConfig returns a new Config with defaults.
func NewConfig() Config {
return Config{
Writer: DefaultWriter,
HardWraps: false,
XHTML: false,
Unsafe: false,
Writer: DefaultWriter,
HardWraps: false,
EastAsianLineBreaks: EastAsianLineBreaksNone,
XHTML: false,
Unsafe: false,
}
}
@ -33,6 +38,8 @@ func (c *Config) SetOption(name renderer.OptionName, value interface{}) {
switch name {
case optHardWraps:
c.HardWraps = value.(bool)
case optEastAsianLineBreaks:
c.EastAsianLineBreaks = value.(EastAsianLineBreaks)
case optXHTML:
c.XHTML = value.(bool)
case optUnsafe:
@ -94,6 +101,99 @@ func WithHardWraps() interface {
return &withHardWraps{}
}
// EastAsianLineBreaks is an option name used in WithEastAsianLineBreaks.
const optEastAsianLineBreaks renderer.OptionName = "EastAsianLineBreaks"
// A EastAsianLineBreaks is a style of east asian line breaks.
type EastAsianLineBreaks int
const (
//EastAsianLineBreaksNone renders line breaks as it is.
EastAsianLineBreaksNone EastAsianLineBreaks = iota
// EastAsianLineBreaksSimple follows east_asian_line_breaks in Pandoc.
EastAsianLineBreaksSimple
// EastAsianLineBreaksCSS3Draft follows CSS text level3 "Segment Break Transformation Rules" with some enhancements.
EastAsianLineBreaksCSS3Draft
)
func (b EastAsianLineBreaks) softLineBreak(thisLastRune rune, siblingFirstRune rune) bool {
switch b {
case EastAsianLineBreaksNone:
return false
case EastAsianLineBreaksSimple:
return !(util.IsEastAsianWideRune(thisLastRune) && util.IsEastAsianWideRune(siblingFirstRune))
case EastAsianLineBreaksCSS3Draft:
return eastAsianLineBreaksCSS3DraftSoftLineBreak(thisLastRune, siblingFirstRune)
}
return false
}
func eastAsianLineBreaksCSS3DraftSoftLineBreak(thisLastRune rune, siblingFirstRune rune) bool {
// Implements CSS text level3 Segment Break Transformation Rules with some enhancements.
// References:
// - https://www.w3.org/TR/2020/WD-css-text-3-20200429/#line-break-transform
// - https://github.com/w3c/csswg-drafts/issues/5086
// Rule1:
// If the character immediately before or immediately after the segment break is
// the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.
if thisLastRune == '\u200B' || siblingFirstRune == '\u200B' {
return false
}
// Rule2:
// Otherwise, if the East Asian Width property of both the character before and after the segment break is
// F, W, or H (not A), and neither side is Hangul, then the segment break is removed.
thisLastRuneEastAsianWidth := util.EastAsianWidth(thisLastRune)
siblingFirstRuneEastAsianWidth := util.EastAsianWidth(siblingFirstRune)
if (thisLastRuneEastAsianWidth == "F" ||
thisLastRuneEastAsianWidth == "W" ||
thisLastRuneEastAsianWidth == "H") &&
(siblingFirstRuneEastAsianWidth == "F" ||
siblingFirstRuneEastAsianWidth == "W" ||
siblingFirstRuneEastAsianWidth == "H") {
return unicode.Is(unicode.Hangul, thisLastRune) || unicode.Is(unicode.Hangul, siblingFirstRune)
}
// Rule3:
// Otherwise, if either the character before or after the segment break belongs to
// the space-discarding character set and it is a Unicode Punctuation (P*) or U+3000,
// then the segment break is removed.
if util.IsSpaceDiscardingUnicodeRune(thisLastRune) ||
unicode.IsPunct(thisLastRune) ||
thisLastRune == '\u3000' ||
util.IsSpaceDiscardingUnicodeRune(siblingFirstRune) ||
unicode.IsPunct(siblingFirstRune) ||
siblingFirstRune == '\u3000' {
return false
}
// Rule4:
// Otherwise, the segment break is converted to a space (U+0020).
return true
}
type withEastAsianLineBreaks struct {
eastAsianLineBreaksStyle EastAsianLineBreaks
}
func (o *withEastAsianLineBreaks) SetConfig(c *renderer.Config) {
c.Options[optEastAsianLineBreaks] = o.eastAsianLineBreaksStyle
}
func (o *withEastAsianLineBreaks) SetHTMLOption(c *Config) {
c.EastAsianLineBreaks = o.eastAsianLineBreaksStyle
}
// WithEastAsianLineBreaks is a functional option that indicates whether softline breaks
// between east asian wide characters should be ignored.
func WithEastAsianLineBreaks(e EastAsianLineBreaks) interface {
renderer.Option
Option
} {
return &withEastAsianLineBreaks{e}
}
// XHTML is an option name used in WithXHTML.
const optXHTML renderer.OptionName = "XHTML"
@ -194,18 +294,54 @@ func (r *Renderer) writeLines(w util.BufWriter, source []byte, n ast.Node) {
}
}
func (r *Renderer) renderDocument(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
// GlobalAttributeFilter defines attribute names which any elements can have.
var GlobalAttributeFilter = util.NewBytesFilter(
[]byte("accesskey"),
[]byte("autocapitalize"),
[]byte("autofocus"),
[]byte("class"),
[]byte("contenteditable"),
[]byte("dir"),
[]byte("draggable"),
[]byte("enterkeyhint"),
[]byte("hidden"),
[]byte("id"),
[]byte("inert"),
[]byte("inputmode"),
[]byte("is"),
[]byte("itemid"),
[]byte("itemprop"),
[]byte("itemref"),
[]byte("itemscope"),
[]byte("itemtype"),
[]byte("lang"),
[]byte("part"),
[]byte("role"),
[]byte("slot"),
[]byte("spellcheck"),
[]byte("style"),
[]byte("tabindex"),
[]byte("title"),
[]byte("translate"),
)
func (r *Renderer) renderDocument(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
// nothing to do
return ast.WalkContinue, nil
}
func (r *Renderer) renderHeading(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
// HeadingAttributeFilter defines attribute names which heading elements can have.
var HeadingAttributeFilter = GlobalAttributeFilter
func (r *Renderer) renderHeading(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.Heading)
if entering {
_, _ = w.WriteString("<h")
_ = w.WriteByte("0123456"[n.Level])
if n.Attributes() != nil {
r.RenderAttributes(w, node)
RenderAttributes(w, node, HeadingAttributeFilter)
}
_ = w.WriteByte('>')
} else {
@ -216,9 +352,21 @@ func (r *Renderer) renderHeading(w util.BufWriter, source []byte, node ast.Node,
return ast.WalkContinue, nil
}
func (r *Renderer) renderBlockquote(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
// BlockquoteAttributeFilter defines attribute names which blockquote elements can have.
var BlockquoteAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("cite"),
)
func (r *Renderer) renderBlockquote(
w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<blockquote>\n")
if n.Attributes() != nil {
_, _ = w.WriteString("<blockquote")
RenderAttributes(w, n, BlockquoteAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("<blockquote>\n")
}
} else {
_, _ = w.WriteString("</blockquote>\n")
}
@ -235,7 +383,8 @@ func (r *Renderer) renderCodeBlock(w util.BufWriter, source []byte, n ast.Node,
return ast.WalkContinue, nil
}
func (r *Renderer) renderFencedCodeBlock(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
func (r *Renderer) renderFencedCodeBlock(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.FencedCodeBlock)
if entering {
_, _ = w.WriteString("<pre><code")
@ -253,14 +402,15 @@ func (r *Renderer) renderFencedCodeBlock(w util.BufWriter, source []byte, node a
return ast.WalkContinue, nil
}
func (r *Renderer) renderHTMLBlock(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
func (r *Renderer) renderHTMLBlock(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.HTMLBlock)
if entering {
if r.Unsafe {
l := n.Lines().Len()
for i := 0; i < l; i++ {
line := n.Lines().At(i)
_, _ = w.Write(line.Value(source))
r.Writer.SecureWrite(w, line.Value(source))
}
} else {
_, _ = w.WriteString("<!-- raw HTML omitted -->\n")
@ -269,7 +419,7 @@ func (r *Renderer) renderHTMLBlock(w util.BufWriter, source []byte, node ast.Nod
if n.HasClosure() {
if r.Unsafe {
closure := n.ClosureLine
_, _ = w.Write(closure.Value(source))
r.Writer.SecureWrite(w, closure.Value(source))
} else {
_, _ = w.WriteString("<!-- raw HTML omitted -->\n")
}
@ -278,6 +428,13 @@ func (r *Renderer) renderHTMLBlock(w util.BufWriter, source []byte, node ast.Nod
return ast.WalkContinue, nil
}
// ListAttributeFilter defines attribute names which list elements can have.
var ListAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("start"),
[]byte("reversed"),
[]byte("type"),
)
func (r *Renderer) renderList(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.List)
tag := "ul"
@ -288,10 +445,12 @@ func (r *Renderer) renderList(w util.BufWriter, source []byte, node ast.Node, en
_ = w.WriteByte('<')
_, _ = w.WriteString(tag)
if n.IsOrdered() && n.Start != 1 {
fmt.Fprintf(w, " start=\"%d\">\n", n.Start)
} else {
_, _ = w.WriteString(">\n")
_, _ = fmt.Fprintf(w, " start=\"%d\"", n.Start)
}
if n.Attributes() != nil {
RenderAttributes(w, n, ListAttributeFilter)
}
_, _ = w.WriteString(">\n")
} else {
_, _ = w.WriteString("</")
_, _ = w.WriteString(tag)
@ -300,9 +459,20 @@ func (r *Renderer) renderList(w util.BufWriter, source []byte, node ast.Node, en
return ast.WalkContinue, nil
}
// ListItemAttributeFilter defines attribute names which list item elements can have.
var ListItemAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("value"),
)
func (r *Renderer) renderListItem(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<li>")
if n.Attributes() != nil {
_, _ = w.WriteString("<li")
RenderAttributes(w, n, ListItemAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("<li>")
}
fc := n.FirstChild()
if fc != nil {
if _, ok := fc.(*ast.TextBlock); !ok {
@ -315,9 +485,18 @@ func (r *Renderer) renderListItem(w util.BufWriter, source []byte, n ast.Node, e
return ast.WalkContinue, nil
}
// ParagraphAttributeFilter defines attribute names which paragraph elements can have.
var ParagraphAttributeFilter = GlobalAttributeFilter
func (r *Renderer) renderParagraph(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<p>")
if n.Attributes() != nil {
_, _ = w.WriteString("<p")
RenderAttributes(w, n, ParagraphAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("<p>")
}
} else {
_, _ = w.WriteString("</p>\n")
}
@ -326,26 +505,54 @@ func (r *Renderer) renderParagraph(w util.BufWriter, source []byte, n ast.Node,
func (r *Renderer) renderTextBlock(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering {
if _, ok := n.NextSibling().(ast.Node); ok && n.FirstChild() != nil {
if n.NextSibling() != nil && n.FirstChild() != nil {
_ = w.WriteByte('\n')
}
}
return ast.WalkContinue, nil
}
func (r *Renderer) renderThematicBreak(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
// ThematicAttributeFilter defines attribute names which hr elements can have.
var ThematicAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("align"), // [Deprecated]
[]byte("color"), // [Not Standardized]
[]byte("noshade"), // [Deprecated]
[]byte("size"), // [Deprecated]
[]byte("width"), // [Deprecated]
)
func (r *Renderer) renderThematicBreak(
w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering {
return ast.WalkContinue, nil
}
_, _ = w.WriteString("<hr")
if n.Attributes() != nil {
RenderAttributes(w, n, ThematicAttributeFilter)
}
if r.XHTML {
_, _ = w.WriteString("<hr />\n")
_, _ = w.WriteString(" />\n")
} else {
_, _ = w.WriteString("<hr>\n")
_, _ = w.WriteString(">\n")
}
return ast.WalkContinue, nil
}
func (r *Renderer) renderAutoLink(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
// LinkAttributeFilter defines attribute names which link elements can have.
var LinkAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("download"),
// []byte("href"),
[]byte("hreflang"),
[]byte("media"),
[]byte("ping"),
[]byte("referrerpolicy"),
[]byte("rel"),
[]byte("shape"),
[]byte("target"),
)
func (r *Renderer) renderAutoLink(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.AutoLink)
if !entering {
return ast.WalkContinue, nil
@ -357,23 +564,36 @@ func (r *Renderer) renderAutoLink(w util.BufWriter, source []byte, node ast.Node
_, _ = w.WriteString("mailto:")
}
_, _ = w.Write(util.EscapeHTML(util.URLEscape(url, false)))
_, _ = w.WriteString(`">`)
if n.Attributes() != nil {
_ = w.WriteByte('"')
RenderAttributes(w, n, LinkAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString(`">`)
}
_, _ = w.Write(util.EscapeHTML(label))
_, _ = w.WriteString(`</a>`)
return ast.WalkContinue, nil
}
// CodeAttributeFilter defines attribute names which code elements can have.
var CodeAttributeFilter = GlobalAttributeFilter
func (r *Renderer) renderCodeSpan(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if entering {
_, _ = w.WriteString("<code>")
if n.Attributes() != nil {
_, _ = w.WriteString("<code")
RenderAttributes(w, n, CodeAttributeFilter)
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("<code>")
}
for c := n.FirstChild(); c != nil; c = c.NextSibling() {
segment := c.(*ast.Text).Segment
value := segment.Value(source)
if bytes.HasSuffix(value, []byte("\n")) {
r.Writer.RawWrite(w, value[:len(value)-1])
if c != n.LastChild() {
r.Writer.RawWrite(w, []byte(" "))
}
r.Writer.RawWrite(w, []byte(" "))
} else {
r.Writer.RawWrite(w, value)
}
@ -384,7 +604,11 @@ func (r *Renderer) renderCodeSpan(w util.BufWriter, source []byte, n ast.Node, e
return ast.WalkContinue, nil
}
func (r *Renderer) renderEmphasis(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
// EmphasisAttributeFilter defines attribute names which emphasis elements can have.
var EmphasisAttributeFilter = GlobalAttributeFilter
func (r *Renderer) renderEmphasis(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.Emphasis)
tag := "em"
if n.Level == 2 {
@ -393,6 +617,9 @@ func (r *Renderer) renderEmphasis(w util.BufWriter, source []byte, node ast.Node
if entering {
_ = w.WriteByte('<')
_, _ = w.WriteString(tag)
if n.Attributes() != nil {
RenderAttributes(w, n, EmphasisAttributeFilter)
}
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("</")
@ -415,12 +642,34 @@ func (r *Renderer) renderLink(w util.BufWriter, source []byte, node ast.Node, en
r.Writer.Write(w, n.Title)
_ = w.WriteByte('"')
}
if n.Attributes() != nil {
RenderAttributes(w, n, LinkAttributeFilter)
}
_ = w.WriteByte('>')
} else {
_, _ = w.WriteString("</a>")
}
return ast.WalkContinue, nil
}
// ImageAttributeFilter defines attribute names which image elements can have.
var ImageAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("align"),
[]byte("border"),
[]byte("crossorigin"),
[]byte("decoding"),
[]byte("height"),
[]byte("importance"),
[]byte("intrinsicsize"),
[]byte("ismap"),
[]byte("loading"),
[]byte("referrerpolicy"),
[]byte("sizes"),
[]byte("srcset"),
[]byte("usemap"),
[]byte("width"),
)
func (r *Renderer) renderImage(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering {
return ast.WalkContinue, nil
@ -431,13 +680,16 @@ func (r *Renderer) renderImage(w util.BufWriter, source []byte, node ast.Node, e
_, _ = w.Write(util.EscapeHTML(util.URLEscape(n.Destination, true)))
}
_, _ = w.WriteString(`" alt="`)
_, _ = w.Write(n.Text(source))
r.renderTexts(w, source, n)
_ = w.WriteByte('"')
if n.Title != nil {
_, _ = w.WriteString(` title="`)
r.Writer.Write(w, n.Title)
_ = w.WriteByte('"')
}
if n.Attributes() != nil {
RenderAttributes(w, n, ImageAttributeFilter)
}
if r.XHTML {
_, _ = w.WriteString(" />")
} else {
@ -446,7 +698,8 @@ func (r *Renderer) renderImage(w util.BufWriter, source []byte, node ast.Node, e
return ast.WalkSkipChildren, nil
}
func (r *Renderer) renderRawHTML(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
func (r *Renderer) renderRawHTML(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering {
return ast.WalkSkipChildren, nil
}
@ -472,7 +725,8 @@ func (r *Renderer) renderText(w util.BufWriter, source []byte, node ast.Node, en
if n.IsRaw() {
r.Writer.RawWrite(w, segment.Value(source))
} else {
r.Writer.Write(w, segment.Value(source))
value := segment.Value(source)
r.Writer.Write(w, value)
if n.HardLineBreak() || (n.SoftLineBreak() && r.HardWraps) {
if r.XHTML {
_, _ = w.WriteString("<br />\n")
@ -480,7 +734,20 @@ func (r *Renderer) renderText(w util.BufWriter, source []byte, node ast.Node, en
_, _ = w.WriteString("<br>\n")
}
} else if n.SoftLineBreak() {
_ = w.WriteByte('\n')
if r.EastAsianLineBreaks != EastAsianLineBreaksNone && len(value) != 0 {
sibling := node.NextSibling()
if sibling != nil && sibling.Kind() == ast.KindText {
if siblingText := sibling.(*ast.Text).Value(source); len(siblingText) != 0 {
thisLastRune := util.ToRune(value, len(value)-1)
siblingFirstRune, _ := utf8.DecodeRune(siblingText)
if r.EastAsianLineBreaks.softLineBreak(thisLastRune, siblingFirstRune) {
_ = w.WriteByte('\n')
}
}
}
} else {
_ = w.WriteByte('\n')
}
}
}
return ast.WalkContinue, nil
@ -503,30 +770,89 @@ func (r *Renderer) renderString(w util.BufWriter, source []byte, node ast.Node,
return ast.WalkContinue, nil
}
// RenderAttributes renders given node's attributes.
func (r *Renderer) RenderAttributes(w util.BufWriter, node ast.Node) {
func (r *Renderer) renderTexts(w util.BufWriter, source []byte, n ast.Node) {
for c := n.FirstChild(); c != nil; c = c.NextSibling() {
if s, ok := c.(*ast.String); ok {
_, _ = r.renderString(w, source, s, true)
} else if t, ok := c.(*ast.Text); ok {
_, _ = r.renderText(w, source, t, true)
} else {
r.renderTexts(w, source, c)
}
}
}
var dataPrefix = []byte("data-")
// RenderAttributes renders given node's attributes.
// You can specify attribute names to render by the filter.
// If filter is nil, RenderAttributes renders all attributes.
func RenderAttributes(w util.BufWriter, node ast.Node, filter util.BytesFilter) {
for _, attr := range node.Attributes() {
if filter != nil && !filter.Contains(attr.Name) {
if !bytes.HasPrefix(attr.Name, dataPrefix) {
continue
}
}
_, _ = w.WriteString(" ")
_, _ = w.Write(attr.Name)
_, _ = w.WriteString(`="`)
_, _ = w.Write(util.EscapeHTML(attr.Value.([]byte)))
// TODO: convert numeric values to strings
var value []byte
switch typed := attr.Value.(type) {
case []byte:
value = typed
case string:
value = util.StringToReadOnlyBytes(typed)
}
_, _ = w.Write(util.EscapeHTML(value))
_ = w.WriteByte('"')
}
}
// A Writer interface wirtes textual contents to a writer.
// A Writer interface writes textual contents to a writer.
type Writer interface {
// Write writes the given source to writer with resolving references and unescaping
// backslash escaped characters.
Write(writer util.BufWriter, source []byte)
// RawWrite wirtes the given source to writer without resolving references and
// RawWrite writes the given source to writer without resolving references and
// unescaping backslash escaped characters.
RawWrite(writer util.BufWriter, source []byte)
// SecureWrite writes the given source to writer with replacing insecure characters.
SecureWrite(writer util.BufWriter, source []byte)
}
var replacementCharacter = []byte("\ufffd")
// A WriterConfig struct has configurations for the HTML based writers.
type WriterConfig struct {
// EscapedSpace is an option that indicates that a '\' escaped half-space(0x20) should not be rendered.
EscapedSpace bool
}
// A WriterOption interface sets options for HTML based writers.
type WriterOption func(*WriterConfig)
// WithEscapedSpace is a WriterOption indicates that a '\' escaped half-space(0x20) should not be rendered.
func WithEscapedSpace() WriterOption {
return func(c *WriterConfig) {
c.EscapedSpace = true
}
}
type defaultWriter struct {
WriterConfig
}
// NewWriter returns a new Writer.
func NewWriter(opts ...WriterOption) Writer {
w := &defaultWriter{}
for _, opt := range opts {
opt(&w.WriterConfig)
}
return w
}
func escapeRune(writer util.BufWriter, r rune) {
@ -540,6 +866,23 @@ func escapeRune(writer util.BufWriter, r rune) {
_, _ = writer.WriteRune(util.ToValidRune(r))
}
func (d *defaultWriter) SecureWrite(writer util.BufWriter, source []byte) {
n := 0
l := len(source)
for i := 0; i < l; i++ {
if source[i] == '\u0000' {
_, _ = writer.Write(source[i-n : i])
n = 0
_, _ = writer.Write(replacementCharacter)
continue
}
n++
}
if n != 0 {
_, _ = writer.Write(source[l-n:])
}
}
func (d *defaultWriter) RawWrite(writer util.BufWriter, source []byte) {
n := 0
l := len(source)
@ -572,6 +915,19 @@ func (d *defaultWriter) Write(writer util.BufWriter, source []byte) {
escaped = false
continue
}
if d.EscapedSpace && c == ' ' {
d.RawWrite(writer, source[n:i-1])
n = i + 1
escaped = false
continue
}
}
if c == '\x00' {
d.RawWrite(writer, source[n:i])
d.RawWrite(writer, replacementCharacter)
n = i + 1
escaped = false
continue
}
if c == '&' {
pos := i
@ -584,7 +940,7 @@ func (d *defaultWriter) Write(writer util.BufWriter, source []byte) {
if nnext < limit && nc == 'x' || nc == 'X' {
start := nnext + 1
i, ok = util.ReadWhile(source, [2]int{start, limit}, util.IsHexDecimal)
if ok && i < limit && source[i] == ';' {
if ok && i < limit && source[i] == ';' && i-start < 7 {
v, _ := strconv.ParseUint(util.BytesToReadOnlyString(source[start:i]), 16, 32)
d.RawWrite(writer, source[n:pos])
n = i + 1
@ -596,7 +952,7 @@ func (d *defaultWriter) Write(writer util.BufWriter, source []byte) {
start := nnext
i, ok = util.ReadWhile(source, [2]int{start, limit}, util.IsNumeric)
if ok && i < limit && i-start < 8 && source[i] == ';' {
v, _ := strconv.ParseUint(util.BytesToReadOnlyString(source[start:i]), 0, 32)
v, _ := strconv.ParseUint(util.BytesToReadOnlyString(source[start:i]), 10, 32)
d.RawWrite(writer, source[n:pos])
n = i + 1
escapeRune(writer, rune(v))
@ -630,30 +986,36 @@ func (d *defaultWriter) Write(writer util.BufWriter, source []byte) {
d.RawWrite(writer, source[n:])
}
// DefaultWriter is a default implementation of the Writer.
var DefaultWriter = &defaultWriter{}
// DefaultWriter is a default instance of the Writer.
var DefaultWriter = NewWriter()
var bDataImage = []byte("data:image/")
var bPng = []byte("png;")
var bGif = []byte("gif;")
var bJpeg = []byte("jpeg;")
var bWebp = []byte("webp;")
var bSvg = []byte("svg+xml;")
var bJs = []byte("javascript:")
var bVb = []byte("vbscript:")
var bFile = []byte("file:")
var bData = []byte("data:")
func hasPrefix(s, prefix []byte) bool {
return len(s) >= len(prefix) && bytes.Equal(bytes.ToLower(s[0:len(prefix)]), bytes.ToLower(prefix))
}
// IsDangerousURL returns true if the given url seems a potentially dangerous url,
// otherwise false.
func IsDangerousURL(url []byte) bool {
if bytes.HasPrefix(url, bDataImage) && len(url) >= 11 {
if hasPrefix(url, bDataImage) && len(url) >= 11 {
v := url[11:]
if bytes.HasPrefix(v, bPng) || bytes.HasPrefix(v, bGif) ||
bytes.HasPrefix(v, bJpeg) || bytes.HasPrefix(v, bWebp) {
if hasPrefix(v, bPng) || hasPrefix(v, bGif) ||
hasPrefix(v, bJpeg) || hasPrefix(v, bWebp) ||
hasPrefix(v, bSvg) {
return false
}
return true
}
return bytes.HasPrefix(url, bJs) || bytes.HasPrefix(url, bVb) ||
bytes.HasPrefix(url, bFile) || bytes.HasPrefix(url, bData)
return hasPrefix(url, bJs) || hasPrefix(url, bVb) ||
hasPrefix(url, bFile) || hasPrefix(url, bData)
}

View file

@ -1,14 +1,13 @@
// Package renderer renders the given AST to certain formats.
// Package renderer renders the given AST to certain formats.
package renderer
import (
"bufio"
"io"
"sync"
"github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/util"
"sync"
)
// A Config struct is a data structure that holds configuration of the Renderer.
@ -17,7 +16,7 @@ type Config struct {
NodeRenderers util.PrioritizedSlice
}
// NewConfig returns a new Config
// NewConfig returns a new Config.
func NewConfig() *Config {
return &Config{
Options: map[OptionName]interface{}{},
@ -79,7 +78,7 @@ type NodeRenderer interface {
RegisterFuncs(NodeRendererFuncRegisterer)
}
// A NodeRendererFuncRegisterer registers
// A NodeRendererFuncRegisterer registers given NodeRendererFunc to this object.
type NodeRendererFuncRegisterer interface {
// Register registers given NodeRendererFunc to this object.
Register(ast.NodeKind, NodeRendererFunc)
@ -90,7 +89,7 @@ type NodeRendererFuncRegisterer interface {
type Renderer interface {
Render(w io.Writer, source []byte, n ast.Node) error
// AddOptions adds given option to thie parser.
// AddOptions adds given option to this renderer.
AddOptions(...Option)
}

View file

@ -1,10 +1,14 @@
// Package testutil provides utilities for unit tests.
package testutil
import (
"bufio"
"bytes"
"encoding/hex"
"encoding/json"
"fmt"
"os"
"regexp"
"runtime/debug"
"strconv"
"strings"
@ -22,27 +26,82 @@ type TestingT interface {
FailNow()
}
// MarkdownTestCase represents a test case.
type MarkdownTestCase struct {
No int
Markdown string
Expected string
No int
Description string
Options MarkdownTestCaseOptions
Markdown string
Expected string
}
func source(t *MarkdownTestCase) string {
ret := t.Markdown
if t.Options.Trim {
ret = strings.TrimSpace(ret)
}
if t.Options.EnableEscape {
return string(applyEscapeSequence([]byte(ret)))
}
return ret
}
func expected(t *MarkdownTestCase) string {
ret := t.Expected
if t.Options.Trim {
ret = strings.TrimSpace(ret)
}
if t.Options.EnableEscape {
return string(applyEscapeSequence([]byte(ret)))
}
return ret
}
// MarkdownTestCaseOptions represents options for each test case.
type MarkdownTestCaseOptions struct {
EnableEscape bool
Trim bool
}
const attributeSeparator = "//- - - - - - - - -//"
const caseSeparator = "//= = = = = = = = = = = = = = = = = = = = = = = =//"
func DoTestCaseFile(m goldmark.Markdown, filename string, t TestingT) {
var optionsRegexp = regexp.MustCompile(`(?i)\s*options:(.*)`)
// ParseCliCaseArg parses -case command line args.
func ParseCliCaseArg() []int {
ret := []int{}
for _, a := range os.Args {
if strings.HasPrefix(a, "case=") {
parts := strings.Split(a, "=")
for _, cas := range strings.Split(parts[1], ",") {
value, err := strconv.Atoi(strings.TrimSpace(cas))
if err == nil {
ret = append(ret, value)
}
}
}
}
return ret
}
// DoTestCaseFile runs test cases in a given file.
func DoTestCaseFile(m goldmark.Markdown, filename string, t TestingT, no ...int) {
fp, err := os.Open(filename)
if err != nil {
panic(err)
}
defer fp.Close()
defer func() {
_ = fp.Close()
}()
scanner := bufio.NewScanner(fp)
c := MarkdownTestCase{
No: -1,
Markdown: "",
Expected: "",
No: -1,
Description: "",
Options: MarkdownTestCaseOptions{},
Markdown: "",
Expected: "",
}
cases := []MarkdownTestCase{}
line := 0
@ -51,7 +110,15 @@ func DoTestCaseFile(m goldmark.Markdown, filename string, t TestingT) {
if util.IsBlank([]byte(scanner.Text())) {
continue
}
c.No, err = strconv.Atoi(scanner.Text())
header := scanner.Text()
c.Description = ""
if strings.Contains(header, ":") {
parts := strings.Split(header, ":")
c.No, err = strconv.Atoi(strings.TrimSpace(parts[0]))
c.Description = strings.Join(parts[1:], ":")
} else {
c.No, err = strconv.Atoi(scanner.Text())
}
if err != nil {
panic(fmt.Sprintf("%s: invalid case No at line %d", filename, line))
}
@ -59,6 +126,15 @@ func DoTestCaseFile(m goldmark.Markdown, filename string, t TestingT) {
panic(fmt.Sprintf("%s: invalid case at line %d", filename, line))
}
line++
matches := optionsRegexp.FindAllStringSubmatch(scanner.Text(), -1)
if len(matches) != 0 {
err = json.Unmarshal([]byte(matches[0][1]), &c.Options)
if err != nil {
panic(fmt.Sprintf("%s: invalid options at line %d", filename, line))
}
scanner.Scan()
line++
}
if scanner.Text() != attributeSeparator {
panic(fmt.Sprintf("%s: invalid separator '%s' at line %d", filename, scanner.Text(), line))
}
@ -82,23 +158,43 @@ func DoTestCaseFile(m goldmark.Markdown, filename string, t TestingT) {
buf = append(buf, text)
}
c.Expected = strings.Join(buf, "\n")
cases = append(cases, c)
if len(c.Expected) != 0 {
c.Expected = c.Expected + "\n"
}
shouldAdd := len(no) == 0
if !shouldAdd {
for _, n := range no {
if n == c.No {
shouldAdd = true
break
}
}
}
if shouldAdd {
cases = append(cases, c)
}
}
DoTestCases(m, cases, t)
}
func DoTestCases(m goldmark.Markdown, cases []MarkdownTestCase, t TestingT) {
// DoTestCases runs a set of test cases.
func DoTestCases(m goldmark.Markdown, cases []MarkdownTestCase, t TestingT, opts ...parser.ParseOption) {
for _, testCase := range cases {
DoTestCase(m, testCase, t)
DoTestCase(m, testCase, t, opts...)
}
}
func DoTestCase(m goldmark.Markdown, testCase MarkdownTestCase, t TestingT) {
// DoTestCase runs a test case.
func DoTestCase(m goldmark.Markdown, testCase MarkdownTestCase, t TestingT, opts ...parser.ParseOption) {
var ok bool
var out bytes.Buffer
defer func() {
description := ""
if len(testCase.Description) != 0 {
description = ": " + testCase.Description
}
if err := recover(); err != nil {
format := `============= case %d ================
format := `============= case %d%s ================
Markdown:
-----------
%s
@ -112,9 +208,9 @@ Actual
%v
%s
`
t.Errorf(format, testCase.No, testCase.Markdown, testCase.Expected, err, debug.Stack())
t.Errorf(format, testCase.No, description, source(&testCase), expected(&testCase), err, debug.Stack())
} else if !ok {
format := `============= case %d ================
format := `============= case %d%s ================
Markdown:
-----------
%s
@ -126,13 +222,188 @@ Expected:
Actual
---------
%s
Diff
---------
%s
`
t.Errorf(format, testCase.No, testCase.Markdown, testCase.Expected, out.Bytes())
t.Errorf(format, testCase.No, description, source(&testCase), expected(&testCase), out.Bytes(),
DiffPretty([]byte(expected(&testCase)), out.Bytes()))
}
}()
if err := m.Convert([]byte(testCase.Markdown), &out, parser.WithWorkers(16)); err != nil {
if err := m.Convert([]byte(source(&testCase)), &out, opts...); err != nil {
panic(err)
}
ok = bytes.Equal(bytes.TrimSpace(out.Bytes()), bytes.TrimSpace([]byte(testCase.Expected)))
ok = bytes.Equal(bytes.TrimSpace(out.Bytes()), bytes.TrimSpace([]byte(expected(&testCase))))
}
type diffType int
const (
diffRemoved diffType = iota
diffAdded
diffNone
)
type diff struct {
Type diffType
Lines [][]byte
}
func simpleDiff(v1, v2 []byte) []diff {
return simpleDiffAux(
bytes.Split(v1, []byte("\n")),
bytes.Split(v2, []byte("\n")))
}
func simpleDiffAux(v1lines, v2lines [][]byte) []diff {
v1index := map[string][]int{}
for i, line := range v1lines {
key := util.BytesToReadOnlyString(line)
if _, ok := v1index[key]; !ok {
v1index[key] = []int{}
}
v1index[key] = append(v1index[key], i)
}
overlap := map[int]int{}
v1start := 0
v2start := 0
length := 0
for v2pos, line := range v2lines {
newOverlap := map[int]int{}
key := util.BytesToReadOnlyString(line)
if _, ok := v1index[key]; !ok {
v1index[key] = []int{}
}
for _, v1pos := range v1index[key] {
value := 0
if v1pos != 0 {
if v, ok := overlap[v1pos-1]; ok {
value = v
}
}
newOverlap[v1pos] = value + 1
if newOverlap[v1pos] > length {
length = newOverlap[v1pos]
v1start = v1pos - length + 1
v2start = v2pos - length + 1
}
}
overlap = newOverlap
}
if length == 0 {
diffs := []diff{}
if len(v1lines) != 0 {
diffs = append(diffs, diff{diffRemoved, v1lines})
}
if len(v2lines) != 0 {
diffs = append(diffs, diff{diffAdded, v2lines})
}
return diffs
}
diffs := simpleDiffAux(v1lines[:v1start], v2lines[:v2start])
diffs = append(diffs, diff{diffNone, v2lines[v2start : v2start+length]})
diffs = append(diffs, simpleDiffAux(v1lines[v1start+length:],
v2lines[v2start+length:])...)
return diffs
}
// DiffPretty returns pretty formatted diff between given bytes.
func DiffPretty(v1, v2 []byte) []byte {
var b bytes.Buffer
diffs := simpleDiff(v1, v2)
for _, diff := range diffs {
c := " "
switch diff.Type {
case diffAdded:
c = "+"
case diffRemoved:
c = "-"
case diffNone:
c = " "
}
for _, line := range diff.Lines {
if c != " " {
b.WriteString(fmt.Sprintf("%s | %s\n", c, util.VisualizeSpaces(line)))
} else {
b.WriteString(fmt.Sprintf("%s | %s\n", c, line))
}
}
}
return b.Bytes()
}
func applyEscapeSequence(b []byte) []byte {
result := make([]byte, 0, len(b))
for i := 0; i < len(b); i++ {
if b[i] == '\\' && i != len(b)-1 {
switch b[i+1] {
case 'a':
result = append(result, '\a')
i++
continue
case 'b':
result = append(result, '\b')
i++
continue
case 'f':
result = append(result, '\f')
i++
continue
case 'n':
result = append(result, '\n')
i++
continue
case 'r':
result = append(result, '\r')
i++
continue
case 't':
result = append(result, '\t')
i++
continue
case 'v':
result = append(result, '\v')
i++
continue
case '\\':
result = append(result, '\\')
i++
continue
case 'x':
if len(b) >= i+3 && util.IsHexDecimal(b[i+2]) && util.IsHexDecimal(b[i+3]) {
v, _ := hex.DecodeString(string(b[i+2 : i+4]))
result = append(result, v[0])
i += 3
continue
}
case 'u', 'U':
if len(b) > i+2 {
num := []byte{}
for j := i + 2; j < len(b); j++ {
if util.IsHexDecimal(b[j]) {
num = append(num, b[j])
continue
}
break
}
if len(num) >= 4 && len(num) < 8 {
v, _ := strconv.ParseInt(string(num[:4]), 16, 32)
result = append(result, []byte(string(rune(v)))...)
i += 5
continue
}
if len(num) >= 8 {
v, _ := strconv.ParseInt(string(num[:8]), 16, 32)
result = append(result, []byte(string(rune(v)))...)
i += 9
continue
}
}
}
}
result = append(result, b[i])
}
return result
}

2
text/package.go Normal file
View file

@ -0,0 +1,2 @@
// Package text provides functionalities to manipulate texts.
package text

View file

@ -1,10 +1,12 @@
package text
import (
"github.com/yuin/goldmark/util"
"bytes"
"io"
"regexp"
"unicode/utf8"
"github.com/yuin/goldmark/util"
)
const invalidValue = -1
@ -69,6 +71,28 @@ type Reader interface {
// Match performs regular expression searching to current line.
FindSubMatch(reg *regexp.Regexp) [][]byte
// FindClosure finds corresponding closure.
FindClosure(opener, closer byte, options FindClosureOptions) (*Segments, bool)
}
// FindClosureOptions is options for Reader.FindClosure.
type FindClosureOptions struct {
// CodeSpan is a flag for the FindClosure. If this is set to true,
// FindClosure ignores closers in codespans.
CodeSpan bool
// Nesting is a flag for the FindClosure. If this is set to true,
// FindClosure allows nesting.
Nesting bool
// Newline is a flag for the FindClosure. If this is set to true,
// FindClosure searches for a closer over multiple lines.
Newline bool
// Advance is a flag for the FindClosure. If this is set to true,
// FindClosure advances pointers when closer is found.
Advance bool
}
type reader struct {
@ -91,6 +115,10 @@ func NewReader(source []byte) Reader {
return r
}
func (r *reader) FindClosure(opener, closer byte, options FindClosureOptions) (*Segments, bool) {
return findClosureReader(r, opener, closer, options)
}
func (r *reader) ResetPosition() {
r.line = -1
r.head = 0
@ -126,7 +154,7 @@ func (r *reader) PeekLine() ([]byte, Segment) {
return nil, r.pos
}
// io.RuneReader interface
// io.RuneReader interface.
func (r *reader) ReadRune() (rune, int, error) {
return readRuneReader(r)
}
@ -138,7 +166,7 @@ func (r *reader) LineOffset() int {
if r.source[i] == '\t' {
v += util.TabWidth(v)
} else {
v += 1
v++
}
}
r.lineOffset = v - r.pos.Padding
@ -271,6 +299,10 @@ func NewBlockReader(source []byte, segments *Segments) BlockReader {
return r
}
func (r *blockReader) FindClosure(opener, closer byte, options FindClosureOptions) (*Segments, bool) {
return findClosureReader(r, opener, closer, options)
}
func (r *blockReader) ResetPosition() {
r.line = -1
r.head = 0
@ -322,7 +354,7 @@ func (r *blockReader) Value(seg Segment) []byte {
return ret
}
// io.RuneReader interface
// io.RuneReader interface.
func (r *blockReader) ReadRune() (rune, int, error) {
return readRuneReader(r)
}
@ -331,7 +363,11 @@ func (r *blockReader) PrecendingCharacter() rune {
if r.pos.Padding != 0 {
return rune(' ')
}
if r.pos.Start <= 0 {
if r.segments.Len() < 1 {
return rune('\n')
}
firstSegment := r.segments.At(0)
if r.line == 0 && r.pos.Start <= firstSegment.Start {
return rune('\n')
}
l := len(r.source)
@ -355,7 +391,7 @@ func (r *blockReader) LineOffset() int {
if r.source[i] == '\t' {
v += util.TabWidth(v)
} else {
v += 1
v++
}
}
r.lineOffset = v - r.pos.Padding
@ -502,24 +538,30 @@ func matchReader(r Reader, reg *regexp.Regexp) bool {
}
func findSubMatchReader(r Reader, reg *regexp.Regexp) [][]byte {
oldline, oldseg := r.Position()
oldLine, oldSeg := r.Position()
match := reg.FindReaderSubmatchIndex(r)
r.SetPosition(oldline, oldseg)
r.SetPosition(oldLine, oldSeg)
if match == nil {
return nil
}
runes := make([]rune, 0, match[1]-match[0])
var bb bytes.Buffer
bb.Grow(match[1] - match[0])
for i := 0; i < match[1]; {
r, size, _ := readRuneReader(r)
i += size
runes = append(runes, r)
bb.WriteRune(r)
}
result := [][]byte{}
bs := bb.Bytes()
var result [][]byte
for i := 0; i < len(match); i += 2 {
result = append(result, []byte(string(runes[match[i]:match[i+1]])))
if match[i] < 0 {
result = append(result, []byte{})
continue
}
result = append(result, bs[match[i]:match[i+1]])
}
r.SetPosition(oldline, oldseg)
r.SetPosition(oldLine, oldSeg)
r.Advance(match[1] - match[0])
return result
}
@ -536,3 +578,83 @@ func readRuneReader(r Reader) (rune, int, error) {
r.Advance(size)
return rn, size, nil
}
func findClosureReader(r Reader, opener, closer byte, opts FindClosureOptions) (*Segments, bool) {
opened := 1
codeSpanOpener := 0
closed := false
orgline, orgpos := r.Position()
var ret *Segments
for {
bs, seg := r.PeekLine()
if bs == nil {
goto end
}
i := 0
for i < len(bs) {
c := bs[i]
if opts.CodeSpan && codeSpanOpener != 0 && c == '`' {
codeSpanCloser := 0
for ; i < len(bs); i++ {
if bs[i] == '`' {
codeSpanCloser++
} else {
i--
break
}
}
if codeSpanCloser == codeSpanOpener {
codeSpanOpener = 0
}
} else if codeSpanOpener == 0 && c == '\\' && i < len(bs)-1 && util.IsPunct(bs[i+1]) {
i += 2
continue
} else if opts.CodeSpan && codeSpanOpener == 0 && c == '`' {
for ; i < len(bs); i++ {
if bs[i] == '`' {
codeSpanOpener++
} else {
i--
break
}
}
} else if (opts.CodeSpan && codeSpanOpener == 0) || !opts.CodeSpan {
if c == closer {
opened--
if opened == 0 {
if ret == nil {
ret = NewSegments()
}
ret.Append(seg.WithStop(seg.Start + i))
r.Advance(i + 1)
closed = true
goto end
}
} else if c == opener {
if !opts.Nesting {
goto end
}
opened++
}
}
i++
}
if !opts.Newline {
goto end
}
r.AdvanceLine()
if ret == nil {
ret = NewSegments()
}
ret.Append(seg)
}
end:
if !opts.Advance {
r.SetPosition(orgline, orgpos)
}
if closed {
return ret, true
}
return nil, false
}

16
text/reader_test.go Normal file
View file

@ -0,0 +1,16 @@
package text
import (
"regexp"
"testing"
)
func TestFindSubMatchReader(t *testing.T) {
s := "微笑"
r := NewReader([]byte(":" + s + ":"))
reg := regexp.MustCompile(`:(\p{L}+):`)
match := r.FindSubMatch(reg)
if len(match) != 2 || string(match[1]) != s {
t.Fatal("no match cjk")
}
}

View file

@ -2,12 +2,13 @@ package text
import (
"bytes"
"github.com/yuin/goldmark/util"
)
var space = []byte(" ")
// A Segment struct holds information about source potisions.
// A Segment struct holds information about source positions.
type Segment struct {
// Start is a start position of the segment.
Start int
@ -18,6 +19,20 @@ type Segment struct {
// Padding is a padding length of the segment.
Padding int
// ForceNewline is true if the segment should be ended with a newline.
// Some elements(i.e. CodeBlock, FencedCodeBlock) does not trim trailing
// newlines. Spec defines that EOF is treated as a newline, so we need to
// add a newline to the end of the segment if it is not empty.
//
// i.e.:
//
// ```go
// const test = "test"
//
// This code does not close the code block and ends with EOF. In this case,
// we need to add a newline to the end of the last line like `const test = "test"\n`.
ForceNewline bool
}
// NewSegment return a new Segment.
@ -40,12 +55,18 @@ func NewSegmentPadding(start, stop, n int) Segment {
// Value returns a value of the segment.
func (t *Segment) Value(buffer []byte) []byte {
var result []byte
if t.Padding == 0 {
return buffer[t.Start:t.Stop]
result = buffer[t.Start:t.Stop]
} else {
result = make([]byte, 0, t.Padding+t.Stop-t.Start+1)
result = append(result, bytes.Repeat(space, t.Padding)...)
result = append(result, buffer[t.Start:t.Stop]...)
}
result := make([]byte, 0, t.Padding+t.Stop-t.Start+1)
result = append(result, bytes.Repeat(space, t.Padding)...)
return append(result, buffer[t.Start:t.Stop]...)
if t.ForceNewline && len(result) > 0 && result[len(result)-1] != '\n' {
result = append(result, '\n')
}
return result
}
// Len returns a length of the segment.
@ -197,7 +218,7 @@ func (s *Segments) Sliced(lo, hi int) []Segment {
return s.values[lo:hi]
}
// Clear delete all element of the collction.
// Clear delete all element of the collection.
func (s *Segments) Clear() {
s.values = nil
}
@ -207,3 +228,12 @@ func (s *Segments) Unshift(v Segment) {
s.values = append(s.values[0:1], s.values[0:]...)
s.values[0] = v
}
// Value returns a string value of the collection.
func (s *Segments) Value(buffer []byte) []byte {
var result []byte
for _, v := range s.values {
result = append(result, v.Value(buffer)...)
}
return result
}

File diff suppressed because it is too large Load diff

1535
util/unicode_case_folding.go Normal file

File diff suppressed because it is too large Load diff

View file

@ -8,7 +8,7 @@ import (
"regexp"
"sort"
"strconv"
"strings"
"unicode"
"unicode/utf8"
)
@ -28,6 +28,7 @@ func NewCopyOnWriteBuffer(buffer []byte) CopyOnWriteBuffer {
}
// Write writes given bytes to the buffer.
// Write allocate new buffer and clears it at the first time.
func (b *CopyOnWriteBuffer) Write(value []byte) {
if !b.copied {
b.buffer = make([]byte, 0, len(b.buffer)+20)
@ -36,13 +37,51 @@ func (b *CopyOnWriteBuffer) Write(value []byte) {
b.buffer = append(b.buffer, value...)
}
// WriteString writes given string to the buffer.
// WriteString allocate new buffer and clears it at the first time.
func (b *CopyOnWriteBuffer) WriteString(value string) {
b.Write(StringToReadOnlyBytes(value))
}
// Append appends given bytes to the buffer.
// Append copy buffer at the first time.
func (b *CopyOnWriteBuffer) Append(value []byte) {
if !b.copied {
tmp := make([]byte, len(b.buffer), len(b.buffer)+20)
copy(tmp, b.buffer)
b.buffer = tmp
b.copied = true
}
b.buffer = append(b.buffer, value...)
}
// AppendString appends given string to the buffer.
// AppendString copy buffer at the first time.
func (b *CopyOnWriteBuffer) AppendString(value string) {
b.Append(StringToReadOnlyBytes(value))
}
// WriteByte writes the given byte to the buffer.
func (b *CopyOnWriteBuffer) WriteByte(c byte) {
// WriteByte allocate new buffer and clears it at the first time.
func (b *CopyOnWriteBuffer) WriteByte(c byte) error {
if !b.copied {
b.buffer = make([]byte, 0, len(b.buffer)+20)
b.copied = true
}
b.buffer = append(b.buffer, c)
return nil
}
// AppendByte appends given bytes to the buffer.
// AppendByte copy buffer at the first time.
func (b *CopyOnWriteBuffer) AppendByte(c byte) {
if !b.copied {
tmp := make([]byte, len(b.buffer), len(b.buffer)+20)
copy(tmp, b.buffer)
b.buffer = tmp
b.copied = true
}
b.buffer = append(b.buffer, c)
}
// Bytes returns bytes of this buffer.
@ -55,7 +94,7 @@ func (b *CopyOnWriteBuffer) IsCopied() bool {
return b.copied
}
// IsEscapedPunctuation returns true if caracter at a given index i
// IsEscapedPunctuation returns true if character at a given index i
// is an escaped punctuation, otherwise false.
func IsEscapedPunctuation(source []byte, i int) bool {
return source[i] == '\\' && i < len(source)-1 && IsPunct(source[i+1])
@ -91,7 +130,10 @@ func VisualizeSpaces(bs []byte) []byte {
bs = bytes.Replace(bs, []byte(" "), []byte("[SPACE]"), -1)
bs = bytes.Replace(bs, []byte("\t"), []byte("[TAB]"), -1)
bs = bytes.Replace(bs, []byte("\n"), []byte("[NEWLINE]\n"), -1)
bs = bytes.Replace(bs, []byte("\r"), []byte("[CR]\n"), -1)
bs = bytes.Replace(bs, []byte("\r"), []byte("[CR]"), -1)
bs = bytes.Replace(bs, []byte("\v"), []byte("[VTAB]"), -1)
bs = bytes.Replace(bs, []byte("\x00"), []byte("[NUL]"), -1)
bs = bytes.Replace(bs, []byte("\ufffd"), []byte("[U+FFFD]"), -1)
return bs
}
@ -104,37 +146,14 @@ func TabWidth(currentPos int) int {
// If the line contains tab characters, paddings may be not zero.
// currentPos==0 and width==2:
//
// position: 0 1
// [TAB]aaaa
// width: 1234 5678
// position: 0 1
// [TAB]aaaa
// width: 1234 5678
//
// width=2 is in the tab character. In this case, IndentPosition returns
// (pos=1, padding=2)
// (pos=1, padding=2).
func IndentPosition(bs []byte, currentPos, width int) (pos, padding int) {
if width == 0 {
return 0, 0
}
w := 0
l := len(bs)
i := 0
hasTab := false
for ; i < l; i++ {
if bs[i] == '\t' {
w += TabWidth(currentPos + w)
hasTab = true
} else if bs[i] == ' ' {
w++
} else {
break
}
}
if w >= width {
if !hasTab {
return width, 0
}
return i, w - width
}
return -1, -1
return IndentPositionPadding(bs, currentPos, 0, width)
}
// IndentPositionPadding searches an indent position with the given width for the given line.
@ -147,10 +166,16 @@ func IndentPositionPadding(bs []byte, currentPos, paddingv, width int) (pos, pad
w := 0
i := 0
l := len(bs)
p := paddingv
for ; i < l; i++ {
if bs[i] == '\t' {
if p > 0 {
p--
w++
continue
}
if bs[i] == '\t' && w < width {
w += TabWidth(currentPos + w)
} else if bs[i] == ' ' {
} else if bs[i] == ' ' && w < width {
w++
} else {
break
@ -163,6 +188,8 @@ func IndentPositionPadding(bs []byte, currentPos, paddingv, width int) (pos, pad
}
// DedentPosition dedents lines by the given width.
//
// Deprecated: This function has bugs. Use util.IndentPositionPadding and util.FirstNonSpacePosition.
func DedentPosition(bs []byte, currentPos, width int) (pos, padding int) {
if width == 0 {
return 0, 0
@ -188,6 +215,8 @@ func DedentPosition(bs []byte, currentPos, width int) (pos, padding int) {
// DedentPositionPadding dedents lines by the given width.
// This function is mostly same as DedentPosition except this function
// takes account into additional paddings.
//
// Deprecated: This function has bugs. Use util.IndentPositionPadding and util.FirstNonSpacePosition.
func DedentPositionPadding(bs []byte, currentPos, paddingv, width int) (pos, padding int) {
if width == 0 {
return 0, paddingv
@ -229,7 +258,7 @@ func IndentWidth(bs []byte, currentPos int) (width, pos int) {
return
}
// FirstNonSpacePosition returns a potisoin line that is a first nonspace
// FirstNonSpacePosition returns a position line that is a first nonspace
// character.
func FirstNonSpacePosition(bs []byte) int {
i := 0
@ -250,6 +279,10 @@ func FirstNonSpacePosition(bs []byte) int {
// If codeSpan is set true, it ignores characters in code spans.
// If allowNesting is set true, closures correspond to nested opener will be
// ignored.
//
// Deprecated: This function can not handle newlines. Many elements
// can be existed over multiple lines(e.g. link labels).
// Use text.Reader.FindClosure.
func FindClosure(bs []byte, opener, closure byte, codeSpan, allowNesting bool) int {
i := 0
opened := 1
@ -262,13 +295,14 @@ func FindClosure(bs []byte, opener, closure byte, codeSpan, allowNesting bool) i
if bs[i] == '`' {
codeSpanCloser++
} else {
i--
break
}
}
if codeSpanCloser == codeSpanOpener {
codeSpanOpener = 0
}
} else if c == '\\' && i < len(bs)-1 && IsPunct(bs[i+1]) {
} else if codeSpanOpener == 0 && c == '\\' && i < len(bs)-1 && IsPunct(bs[i+1]) {
i += 2
continue
} else if codeSpan && codeSpanOpener == 0 && c == '`' {
@ -276,6 +310,7 @@ func FindClosure(bs []byte, opener, closure byte, codeSpan, allowNesting bool) i
if bs[i] == '`' {
codeSpanOpener++
} else {
i--
break
}
}
@ -385,6 +420,52 @@ func TrimRightSpace(source []byte) []byte {
return TrimRight(source, spaces)
}
// DoFullUnicodeCaseFolding performs full unicode case folding to given bytes.
func DoFullUnicodeCaseFolding(v []byte) []byte {
var rbuf []byte
cob := NewCopyOnWriteBuffer(v)
n := 0
for i := 0; i < len(v); i++ {
c := v[i]
if c < 0xb5 {
if c >= 0x41 && c <= 0x5a {
// A-Z to a-z
cob.Write(v[n:i])
_ = cob.WriteByte(c + 32)
n = i + 1
}
continue
}
if !utf8.RuneStart(c) {
continue
}
r, length := utf8.DecodeRune(v[i:])
if r == utf8.RuneError {
continue
}
folded, ok := unicodeCaseFoldings[r]
if !ok {
continue
}
cob.Write(v[n:i])
if rbuf == nil {
rbuf = make([]byte, 4)
}
for _, f := range folded {
l := utf8.EncodeRune(rbuf, f)
cob.Write(rbuf[:l])
}
i += length - 1
n = i + 1
}
if cob.IsCopied() {
cob.Write(v[n:])
}
return cob.Bytes()
}
// ReplaceSpaces replaces sequence of spaces with the given repl.
func ReplaceSpaces(source []byte, repl byte) []byte {
var ret []byte
@ -437,16 +518,17 @@ func ToValidRune(v rune) rune {
return v
}
// ToLinkReference convert given bytes into a valid link reference string.
// ToLinkReference trims leading and trailing spaces and convert into lower
// ToLinkReference converts given bytes into a valid link reference string.
// ToLinkReference performs unicode case folding, trims leading and trailing spaces, converts into lower
// case and replace spaces with a single space character.
func ToLinkReference(v []byte) string {
v = TrimLeftSpace(v)
v = TrimRightSpace(v)
return strings.ToLower(string(ReplaceSpaces(v, ' ')))
v = DoFullUnicodeCaseFolding(v)
return string(ReplaceSpaces(v, ' '))
}
var htmlEscapeTable = [256][]byte{nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&quot;"), nil, nil, nil, []byte("&amp;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&lt;"), nil, []byte("&gt;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil}
var htmlEscapeTable = [256][]byte{nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&quot;"), nil, nil, nil, []byte("&amp;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&lt;"), nil, []byte("&gt;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil} //nolint:golint,lll
// EscapeHTMLByte returns HTML escaped bytes if the given byte should be escaped,
// otherwise nil.
@ -482,7 +564,7 @@ func UnescapePunctuations(source []byte) []byte {
c := source[i]
if i < limit-1 && c == '\\' && IsPunct(source[i+1]) {
cob.Write(source[n:i])
cob.WriteByte(source[i+1])
_ = cob.WriteByte(source[i+1])
i += 2
n = i
continue
@ -498,9 +580,9 @@ func UnescapePunctuations(source []byte) []byte {
// ResolveNumericReferences resolve numeric references like '&#1234;" .
func ResolveNumericReferences(source []byte) []byte {
cob := NewCopyOnWriteBuffer(source)
buf := make([]byte, 6, 6)
buf := make([]byte, 6)
limit := len(source)
ok := false
var ok bool
n := 0
for i := 0; i < limit; i++ {
if source[i] == '&' {
@ -550,7 +632,7 @@ func ResolveNumericReferences(source []byte) []byte {
func ResolveEntityNames(source []byte) []byte {
cob := NewCopyOnWriteBuffer(source)
limit := len(source)
ok := false
var ok bool
n := 0
for i := 0; i < limit; i++ {
if source[i] == '&' {
@ -583,11 +665,11 @@ var htmlSpace = []byte("%20")
// URLEscape escape the given URL.
// If resolveReference is set true:
// 1. unescape punctuations
// 2. resolve numeric references
// 3. resolve entity references
// 1. unescape punctuations
// 2. resolve numeric references
// 3. resolve entity references
//
// URL encoded values (%xx) are keeped as is.
// URL encoded values (%xx) are kept as is.
func URLEscape(v []byte, resolveReference bool) []byte {
if resolveReference {
v = UnescapePunctuations(v)
@ -620,8 +702,22 @@ func URLEscape(v []byte, resolveReference bool) []byte {
n = i
continue
}
if int(u8len) > len(v) {
u8len = int8(len(v) - 1)
}
if u8len == 0 {
i++
n = i
continue
}
cob.Write(v[n:i])
cob.Write(StringToReadOnlyBytes(url.QueryEscape(string(v[i : i+int(u8len)]))))
stop := i + int(u8len)
if stop > len(v) {
i++
n = i
continue
}
cob.Write(StringToReadOnlyBytes(url.QueryEscape(string(v[i:stop]))))
i += int(u8len)
n = i
}
@ -661,7 +757,7 @@ func FindURLIndex(b []byte) int {
return i
}
var emailDomainRegexp = regexp.MustCompile(`^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*`)
var emailDomainRegexp = regexp.MustCompile(`^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*`) //nolint:golint,lll
// FindEmailIndex returns a stop index value if the given bytes seem an email address.
func FindEmailIndex(b []byte) int {
@ -692,18 +788,19 @@ func FindEmailIndex(b []byte) int {
var spaces = []byte(" \t\n\x0b\x0c\x0d")
var spaceTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var spaceTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
var punctTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var punctTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
// a-zA-Z0-9, ;/?:@&=+$,-_.!~*'()#
var urlEscapeTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var utf8lenTable = [256]int8{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 99, 99, 99, 99, 99, 99, 99, 99}
var urlEscapeTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
var urlTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 5, 5, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 0, 1, 0, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}
var utf8lenTable = [256]int8{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 99, 99, 99, 99, 99, 99, 99, 99} //nolint:golint,lll
var emailTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var urlTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 5, 5, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 0, 1, 0, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1} //nolint:golint,lll
var emailTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
// UTF8Len returns a byte length of the utf-8 character.
func UTF8Len(b byte) int8 {
@ -715,11 +812,21 @@ func IsPunct(c byte) bool {
return punctTable[c] == 1
}
// IsPunctRune returns true if the given rune is a punctuation, otherwise false.
func IsPunctRune(r rune) bool {
return unicode.IsSymbol(r) || unicode.IsPunct(r)
}
// IsSpace returns true if the given character is a space, otherwise false.
func IsSpace(c byte) bool {
return spaceTable[c] == 1
}
// IsSpaceRune returns true if the given rune is a space, otherwise false.
func IsSpaceRune(r rune) bool {
return int32(r) <= 256 && IsSpace(byte(r)) || unicode.IsSpace(r)
}
// IsNumeric returns true if the given character is a numeric, otherwise false.
func IsNumeric(c byte) bool {
return c >= '0' && c <= '9'
@ -754,7 +861,7 @@ type PrioritizedValue struct {
Priority int
}
// PrioritizedSlice is a slice of the PrioritizedValues
// PrioritizedSlice is a slice of the PrioritizedValues.
type PrioritizedSlice []PrioritizedValue
// Sort sorts the PrioritizedSlice in ascending order.
@ -784,3 +891,98 @@ func (s PrioritizedSlice) Remove(v interface{}) PrioritizedSlice {
func Prioritized(v interface{}, priority int) PrioritizedValue {
return PrioritizedValue{v, priority}
}
func bytesHash(b []byte) uint64 {
var hash uint64 = 5381
for _, c := range b {
hash = ((hash << 5) + hash) + uint64(c)
}
return hash
}
// BytesFilter is a efficient data structure for checking whether bytes exist or not.
// BytesFilter is thread-safe.
type BytesFilter interface {
// Add adds given bytes to this set.
Add([]byte)
// Contains return true if this set contains given bytes, otherwise false.
Contains([]byte) bool
// Extend copies this filter and adds given bytes to new filter.
Extend(...[]byte) BytesFilter
}
type bytesFilter struct {
chars [256]uint8
threshold int
slots [][][]byte
}
// NewBytesFilter returns a new BytesFilter.
func NewBytesFilter(elements ...[]byte) BytesFilter {
s := &bytesFilter{
threshold: 3,
slots: make([][][]byte, 64),
}
for _, element := range elements {
s.Add(element)
}
return s
}
func (s *bytesFilter) Add(b []byte) {
l := len(b)
m := s.threshold
if l < s.threshold {
m = l
}
for i := 0; i < m; i++ {
s.chars[b[i]] |= 1 << uint8(i)
}
h := bytesHash(b) % uint64(len(s.slots))
slot := s.slots[h]
if slot == nil {
slot = [][]byte{}
}
s.slots[h] = append(slot, b)
}
func (s *bytesFilter) Extend(bs ...[]byte) BytesFilter {
newFilter := NewBytesFilter().(*bytesFilter)
newFilter.chars = s.chars
newFilter.threshold = s.threshold
for k, v := range s.slots {
newSlot := make([][]byte, len(v))
copy(newSlot, v)
newFilter.slots[k] = v
}
for _, b := range bs {
newFilter.Add(b)
}
return newFilter
}
func (s *bytesFilter) Contains(b []byte) bool {
l := len(b)
m := s.threshold
if l < s.threshold {
m = l
}
for i := 0; i < m; i++ {
if (s.chars[b[i]] & (1 << uint8(i))) == 0 {
return false
}
}
h := bytesHash(b) % uint64(len(s.slots))
slot := s.slots[h]
if len(slot) == 0 {
return false
}
for _, element := range slot {
if bytes.Equal(element, b) {
return true
}
}
return false
}

469
util/util_cjk.go Normal file
View file

@ -0,0 +1,469 @@
package util
import "unicode"
var cjkRadicalsSupplement = &unicode.RangeTable{
R16: []unicode.Range16{
{0x2E80, 0x2EFF, 1},
},
}
var kangxiRadicals = &unicode.RangeTable{
R16: []unicode.Range16{
{0x2F00, 0x2FDF, 1},
},
}
var ideographicDescriptionCharacters = &unicode.RangeTable{
R16: []unicode.Range16{
{0x2FF0, 0x2FFF, 1},
},
}
var cjkSymbolsAndPunctuation = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3000, 0x303F, 1},
},
}
var hiragana = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3040, 0x309F, 1},
},
}
var katakana = &unicode.RangeTable{
R16: []unicode.Range16{
{0x30A0, 0x30FF, 1},
},
}
var kanbun = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3130, 0x318F, 1},
{0x3190, 0x319F, 1},
},
}
var cjkStrokes = &unicode.RangeTable{
R16: []unicode.Range16{
{0x31C0, 0x31EF, 1},
},
}
var katakanaPhoneticExtensions = &unicode.RangeTable{
R16: []unicode.Range16{
{0x31F0, 0x31FF, 1},
},
}
var cjkCompatibility = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3300, 0x33FF, 1},
},
}
var cjkUnifiedIdeographsExtensionA = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3400, 0x4DBF, 1},
},
}
var cjkUnifiedIdeographs = &unicode.RangeTable{
R16: []unicode.Range16{
{0x4E00, 0x9FFF, 1},
},
}
var yiSyllables = &unicode.RangeTable{
R16: []unicode.Range16{
{0xA000, 0xA48F, 1},
},
}
var yiRadicals = &unicode.RangeTable{
R16: []unicode.Range16{
{0xA490, 0xA4CF, 1},
},
}
var cjkCompatibilityIdeographs = &unicode.RangeTable{
R16: []unicode.Range16{
{0xF900, 0xFAFF, 1},
},
}
var verticalForms = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFE10, 0xFE1F, 1},
},
}
var cjkCompatibilityForms = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFE30, 0xFE4F, 1},
},
}
var smallFormVariants = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFE50, 0xFE6F, 1},
},
}
var halfwidthAndFullwidthForms = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFF00, 0xFFEF, 1},
},
}
var kanaSupplement = &unicode.RangeTable{
R32: []unicode.Range32{
{0x1B000, 0x1B0FF, 1},
},
}
var kanaExtendedA = &unicode.RangeTable{
R32: []unicode.Range32{
{0x1B100, 0x1B12F, 1},
},
}
var smallKanaExtension = &unicode.RangeTable{
R32: []unicode.Range32{
{0x1B130, 0x1B16F, 1},
},
}
var cjkUnifiedIdeographsExtensionB = &unicode.RangeTable{
R32: []unicode.Range32{
{0x20000, 0x2A6DF, 1},
},
}
var cjkUnifiedIdeographsExtensionC = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2A700, 0x2B73F, 1},
},
}
var cjkUnifiedIdeographsExtensionD = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2B740, 0x2B81F, 1},
},
}
var cjkUnifiedIdeographsExtensionE = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2B820, 0x2CEAF, 1},
},
}
var cjkUnifiedIdeographsExtensionF = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2CEB0, 0x2EBEF, 1},
},
}
var cjkCompatibilityIdeographsSupplement = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2F800, 0x2FA1F, 1},
},
}
var cjkUnifiedIdeographsExtensionG = &unicode.RangeTable{
R32: []unicode.Range32{
{0x30000, 0x3134F, 1},
},
}
// IsEastAsianWideRune returns trhe if the given rune is an east asian wide character, otherwise false.
func IsEastAsianWideRune(r rune) bool {
return unicode.Is(unicode.Hiragana, r) ||
unicode.Is(unicode.Katakana, r) ||
unicode.Is(unicode.Han, r) ||
unicode.Is(unicode.Lm, r) ||
unicode.Is(unicode.Hangul, r) ||
unicode.Is(cjkSymbolsAndPunctuation, r)
}
// IsSpaceDiscardingUnicodeRune returns true if the given rune is space-discarding unicode character, otherwise false.
// See https://www.w3.org/TR/2020/WD-css-text-3-20200429/#space-discard-set
func IsSpaceDiscardingUnicodeRune(r rune) bool {
return unicode.Is(cjkRadicalsSupplement, r) ||
unicode.Is(kangxiRadicals, r) ||
unicode.Is(ideographicDescriptionCharacters, r) ||
unicode.Is(cjkSymbolsAndPunctuation, r) ||
unicode.Is(hiragana, r) ||
unicode.Is(katakana, r) ||
unicode.Is(kanbun, r) ||
unicode.Is(cjkStrokes, r) ||
unicode.Is(katakanaPhoneticExtensions, r) ||
unicode.Is(cjkCompatibility, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionA, r) ||
unicode.Is(cjkUnifiedIdeographs, r) ||
unicode.Is(yiSyllables, r) ||
unicode.Is(yiRadicals, r) ||
unicode.Is(cjkCompatibilityIdeographs, r) ||
unicode.Is(verticalForms, r) ||
unicode.Is(cjkCompatibilityForms, r) ||
unicode.Is(smallFormVariants, r) ||
unicode.Is(halfwidthAndFullwidthForms, r) ||
unicode.Is(kanaSupplement, r) ||
unicode.Is(kanaExtendedA, r) ||
unicode.Is(smallKanaExtension, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionB, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionC, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionD, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionE, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionF, r) ||
unicode.Is(cjkCompatibilityIdeographsSupplement, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionG, r)
}
// EastAsianWidth returns the east asian width of the given rune.
// See https://www.unicode.org/reports/tr11/tr11-36.html
func EastAsianWidth(r rune) string {
switch {
case r == 0x3000,
(0xFF01 <= r && r <= 0xFF60),
(0xFFE0 <= r && r <= 0xFFE6):
return "F"
case r == 0x20A9,
(0xFF61 <= r && r <= 0xFFBE),
(0xFFC2 <= r && r <= 0xFFC7),
(0xFFCA <= r && r <= 0xFFCF),
(0xFFD2 <= r && r <= 0xFFD7),
(0xFFDA <= r && r <= 0xFFDC),
(0xFFE8 <= r && r <= 0xFFEE):
return "H"
case (0x1100 <= r && r <= 0x115F),
(0x11A3 <= r && r <= 0x11A7),
(0x11FA <= r && r <= 0x11FF),
(0x2329 <= r && r <= 0x232A),
(0x2E80 <= r && r <= 0x2E99),
(0x2E9B <= r && r <= 0x2EF3),
(0x2F00 <= r && r <= 0x2FD5),
(0x2FF0 <= r && r <= 0x2FFB),
(0x3001 <= r && r <= 0x303E),
(0x3041 <= r && r <= 0x3096),
(0x3099 <= r && r <= 0x30FF),
(0x3105 <= r && r <= 0x312D),
(0x3131 <= r && r <= 0x318E),
(0x3190 <= r && r <= 0x31BA),
(0x31C0 <= r && r <= 0x31E3),
(0x31F0 <= r && r <= 0x321E),
(0x3220 <= r && r <= 0x3247),
(0x3250 <= r && r <= 0x32FE),
(0x3300 <= r && r <= 0x4DBF),
(0x4E00 <= r && r <= 0xA48C),
(0xA490 <= r && r <= 0xA4C6),
(0xA960 <= r && r <= 0xA97C),
(0xAC00 <= r && r <= 0xD7A3),
(0xD7B0 <= r && r <= 0xD7C6),
(0xD7CB <= r && r <= 0xD7FB),
(0xF900 <= r && r <= 0xFAFF),
(0xFE10 <= r && r <= 0xFE19),
(0xFE30 <= r && r <= 0xFE52),
(0xFE54 <= r && r <= 0xFE66),
(0xFE68 <= r && r <= 0xFE6B),
(0x1B000 <= r && r <= 0x1B001),
(0x1F200 <= r && r <= 0x1F202),
(0x1F210 <= r && r <= 0x1F23A),
(0x1F240 <= r && r <= 0x1F248),
(0x1F250 <= r && r <= 0x1F251),
(0x20000 <= r && r <= 0x2F73F),
(0x2B740 <= r && r <= 0x2FFFD),
(0x30000 <= r && r <= 0x3FFFD):
return "W"
case (0x0020 <= r && r <= 0x007E),
(0x00A2 <= r && r <= 0x00A3),
(0x00A5 <= r && r <= 0x00A6),
r == 0x00AC,
r == 0x00AF,
(0x27E6 <= r && r <= 0x27ED),
(0x2985 <= r && r <= 0x2986):
return "Na"
case (0x00A1 == r),
(0x00A4 == r),
(0x00A7 <= r && r <= 0x00A8),
(0x00AA == r),
(0x00AD <= r && r <= 0x00AE),
(0x00B0 <= r && r <= 0x00B4),
(0x00B6 <= r && r <= 0x00BA),
(0x00BC <= r && r <= 0x00BF),
(0x00C6 == r),
(0x00D0 == r),
(0x00D7 <= r && r <= 0x00D8),
(0x00DE <= r && r <= 0x00E1),
(0x00E6 == r),
(0x00E8 <= r && r <= 0x00EA),
(0x00EC <= r && r <= 0x00ED),
(0x00F0 == r),
(0x00F2 <= r && r <= 0x00F3),
(0x00F7 <= r && r <= 0x00FA),
(0x00FC == r),
(0x00FE == r),
(0x0101 == r),
(0x0111 == r),
(0x0113 == r),
(0x011B == r),
(0x0126 <= r && r <= 0x0127),
(0x012B == r),
(0x0131 <= r && r <= 0x0133),
(0x0138 == r),
(0x013F <= r && r <= 0x0142),
(0x0144 == r),
(0x0148 <= r && r <= 0x014B),
(0x014D == r),
(0x0152 <= r && r <= 0x0153),
(0x0166 <= r && r <= 0x0167),
(0x016B == r),
(0x01CE == r),
(0x01D0 == r),
(0x01D2 == r),
(0x01D4 == r),
(0x01D6 == r),
(0x01D8 == r),
(0x01DA == r),
(0x01DC == r),
(0x0251 == r),
(0x0261 == r),
(0x02C4 == r),
(0x02C7 == r),
(0x02C9 <= r && r <= 0x02CB),
(0x02CD == r),
(0x02D0 == r),
(0x02D8 <= r && r <= 0x02DB),
(0x02DD == r),
(0x02DF == r),
(0x0300 <= r && r <= 0x036F),
(0x0391 <= r && r <= 0x03A1),
(0x03A3 <= r && r <= 0x03A9),
(0x03B1 <= r && r <= 0x03C1),
(0x03C3 <= r && r <= 0x03C9),
(0x0401 == r),
(0x0410 <= r && r <= 0x044F),
(0x0451 == r),
(0x2010 == r),
(0x2013 <= r && r <= 0x2016),
(0x2018 <= r && r <= 0x2019),
(0x201C <= r && r <= 0x201D),
(0x2020 <= r && r <= 0x2022),
(0x2024 <= r && r <= 0x2027),
(0x2030 == r),
(0x2032 <= r && r <= 0x2033),
(0x2035 == r),
(0x203B == r),
(0x203E == r),
(0x2074 == r),
(0x207F == r),
(0x2081 <= r && r <= 0x2084),
(0x20AC == r),
(0x2103 == r),
(0x2105 == r),
(0x2109 == r),
(0x2113 == r),
(0x2116 == r),
(0x2121 <= r && r <= 0x2122),
(0x2126 == r),
(0x212B == r),
(0x2153 <= r && r <= 0x2154),
(0x215B <= r && r <= 0x215E),
(0x2160 <= r && r <= 0x216B),
(0x2170 <= r && r <= 0x2179),
(0x2189 == r),
(0x2190 <= r && r <= 0x2199),
(0x21B8 <= r && r <= 0x21B9),
(0x21D2 == r),
(0x21D4 == r),
(0x21E7 == r),
(0x2200 == r),
(0x2202 <= r && r <= 0x2203),
(0x2207 <= r && r <= 0x2208),
(0x220B == r),
(0x220F == r),
(0x2211 == r),
(0x2215 == r),
(0x221A == r),
(0x221D <= r && r <= 0x2220),
(0x2223 == r),
(0x2225 == r),
(0x2227 <= r && r <= 0x222C),
(0x222E == r),
(0x2234 <= r && r <= 0x2237),
(0x223C <= r && r <= 0x223D),
(0x2248 == r),
(0x224C == r),
(0x2252 == r),
(0x2260 <= r && r <= 0x2261),
(0x2264 <= r && r <= 0x2267),
(0x226A <= r && r <= 0x226B),
(0x226E <= r && r <= 0x226F),
(0x2282 <= r && r <= 0x2283),
(0x2286 <= r && r <= 0x2287),
(0x2295 == r),
(0x2299 == r),
(0x22A5 == r),
(0x22BF == r),
(0x2312 == r),
(0x2460 <= r && r <= 0x24E9),
(0x24EB <= r && r <= 0x254B),
(0x2550 <= r && r <= 0x2573),
(0x2580 <= r && r <= 0x258F),
(0x2592 <= r && r <= 0x2595),
(0x25A0 <= r && r <= 0x25A1),
(0x25A3 <= r && r <= 0x25A9),
(0x25B2 <= r && r <= 0x25B3),
(0x25B6 <= r && r <= 0x25B7),
(0x25BC <= r && r <= 0x25BD),
(0x25C0 <= r && r <= 0x25C1),
(0x25C6 <= r && r <= 0x25C8),
(0x25CB == r),
(0x25CE <= r && r <= 0x25D1),
(0x25E2 <= r && r <= 0x25E5),
(0x25EF == r),
(0x2605 <= r && r <= 0x2606),
(0x2609 == r),
(0x260E <= r && r <= 0x260F),
(0x2614 <= r && r <= 0x2615),
(0x261C == r),
(0x261E == r),
(0x2640 == r),
(0x2642 == r),
(0x2660 <= r && r <= 0x2661),
(0x2663 <= r && r <= 0x2665),
(0x2667 <= r && r <= 0x266A),
(0x266C <= r && r <= 0x266D),
(0x266F == r),
(0x269E <= r && r <= 0x269F),
(0x26BE <= r && r <= 0x26BF),
(0x26C4 <= r && r <= 0x26CD),
(0x26CF <= r && r <= 0x26E1),
(0x26E3 == r),
(0x26E8 <= r && r <= 0x26FF),
(0x273D == r),
(0x2757 == r),
(0x2776 <= r && r <= 0x277F),
(0x2B55 <= r && r <= 0x2B59),
(0x3248 <= r && r <= 0x324F),
(0xE000 <= r && r <= 0xF8FF),
(0xFE00 <= r && r <= 0xFE0F),
(0xFFFD == r),
(0x1F100 <= r && r <= 0x1F10A),
(0x1F110 <= r && r <= 0x1F12D),
(0x1F130 <= r && r <= 0x1F169),
(0x1F170 <= r && r <= 0x1F19A),
(0xE0100 <= r && r <= 0xE01EF),
(0xF0000 <= r && r <= 0xFFFFD),
(0x100000 <= r && r <= 0x10FFFD):
return "A"
default:
return "N"
}
}

View file

@ -1,4 +1,5 @@
// +build appengine,js
//go:build appengine || js
// +build appengine js
package util

View file

@ -1,4 +1,5 @@
// +build !appengine,!js
//go:build !appengine && !js && !go1.21
// +build !appengine,!js,!go1.21
package util
@ -13,8 +14,11 @@ func BytesToReadOnlyString(b []byte) string {
}
// StringToReadOnlyBytes returns bytes converted from given string.
func StringToReadOnlyBytes(s string) []byte {
func StringToReadOnlyBytes(s string) (bs []byte) {
sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
bh := reflect.SliceHeader{Data: sh.Data, Len: sh.Len, Cap: sh.Len}
return *(*[]byte)(unsafe.Pointer(&bh))
bh := (*reflect.SliceHeader)(unsafe.Pointer(&bs))
bh.Data = sh.Data
bh.Cap = sh.Len
bh.Len = sh.Len
return
}

18
util/util_unsafe_go121.go Normal file
View file

@ -0,0 +1,18 @@
//go:build !appengine && !js && go1.21
// +build !appengine,!js,go1.21
package util
import (
"unsafe"
)
// BytesToReadOnlyString returns a string converted from given bytes.
func BytesToReadOnlyString(b []byte) string {
return unsafe.String(unsafe.SliceData(b), len(b))
}
// StringToReadOnlyBytes returns bytes converted from given string.
func StringToReadOnlyBytes(s string) []byte {
return unsafe.Slice(unsafe.StringData(s), len(s))
}