Compare commits

59 commits

Author SHA1 Message Date
Yusuke Inuzuka
04410ff159
Merge pull request #487 from n-peugnet/patch-1
Fix GitHub actions badge URL
2025-02-19 03:38:00 +09:00
Yusuke Inuzuka
f7b2d24c09
Merge pull request #489 from lyricat/master
chore: update `goldmark-enclave` repo URL and information
2025-02-19 03:37:26 +09:00
Lyric Wai
ba49c5c69d
Merge pull request #1 from lyricat/lyricat-patch-enclave
Update README.md
2025-02-18 23:34:09 +09:00
Lyric Wai
c05fb087a4
Update README.md
Update `goldmark-enclave` url and information
2025-02-18 23:33:14 +09:00
Nicolas Peugnet
b39daae79e
Fix GitHub actions badge URL 2025-02-02 15:58:36 +01:00
yuin
d9c03f07f0 Deprecate Node.Text
Node.Text was intended to get a text value from some inline nodes.
A 'text value' of a Text node is clear.

But

- BaseNode had a default implementation of Node.Text
- The GoDoc lacked a description stating that Node.Text is valid only for
  some inline nodes

So, some users are using Node.Text for BlockNodes.

A 'text value' for a BlockNode is not clear.

e.g. : Text value of a ListNode

- Should it contain list markers?
- What characters should concatenate list items? Newlines? Spaces?
- If it contains code blocks, should they be fenced or indented?

Now we would like to avoid such an ambiguous method.
2024-10-16 20:47:35 +09:00
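
To make the ambiguity described in this commit message concrete, here is a minimal sketch. It is not goldmark's code: the `segment` type and the `linesValue` helper are hypothetical stand-ins for the byte ranges a block node keeps over the source. It shows the unambiguous alternative the deprecation points to, which is reading a block's raw lines and letting the caller decide how to treat markers and separators.

```go
package main

import (
	"bytes"
	"fmt"
)

// segment is a hypothetical stand-in for a byte range into the source.
type segment struct{ start, stop int }

// linesValue concatenates the source bytes covered by each segment.
func linesValue(source []byte, lines []segment) []byte {
	var buf bytes.Buffer
	for _, s := range lines {
		buf.Write(source[s.start:s.stop])
	}
	return buf.Bytes()
}

func main() {
	source := []byte("- foo\n- bar\n")
	// Content-only ranges for the two list items; the "- " markers are
	// excluded here, which is exactly the kind of choice Node.Text hid.
	lines := []segment{{2, 6}, {8, 12}}
	fmt.Printf("%q\n", linesValue(source, lines))
}
```

Whether the markers belong in the result is a decision made by whoever builds the segments, which is why a single generic `Text` method for block nodes cannot have one correct answer.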
yuin
65dcf6cd0a Add warning to Node.Text GoDoc 2024-10-15 19:22:00 +09:00
yuin
ad1565131a Fix #470 2024-10-15 19:19:41 +09:00
yuin
bc993b4f59 Fix testcases 2024-10-12 23:12:47 +09:00
yuin
41273a4d07 Fix EOF rendering 2024-10-12 23:05:37 +09:00
Yusuke Inuzuka
d80ac9397c
Merge pull request #462 from Andrew-Morozko/table_fix
Fix panic in table parser
2024-10-12 22:46:35 +09:00
yuin
15000ac6a1 Fix lint errors 2024-10-12 22:42:46 +09:00
Yusuke Inuzuka
14d91f957f
Merge pull request #432 from dr2chase/master
update unsafe code to user newer-faster-better idioms
2024-10-12 22:42:18 +09:00
yuin
3847ca20c6 lazy initialize html5entities 2024-10-12 22:37:57 +09:00
yuin
697e44ce88 Fix #464, Fix #465 2024-10-12 22:18:27 +09:00
yuin
fa88006eee Fix #466 2024-10-12 19:17:05 +09:00
Yusuke Inuzuka
fe34ea5d96
Merge pull request #460 from cbednarski/b-ast-block-text
Add Text() retrieval for BaseBlock types
2024-10-12 18:24:34 +09:00
Andrew Morozko
fd14edc9bc
Fix panic in table parser 2024-08-15 23:38:43 +04:00
Chris Bednarski
e367755421 Added test for BaseBlock.Text retrieval 2024-07-25 23:40:45 -07:00
Chris Bednarski
e44645afbb Implement Text interface for BaseBlock
This implementation was missing, making it impossible to retrieve Text
from block types, such as CodeBlock and FencedCodeBlock, via the ast
interface.
2024-07-25 22:37:54 -07:00
yuin
15ade8aace Fixes #457 2024-06-25 23:29:29 +09:00
yuin
dc32f35808 Fix lint errors 2024-06-23 22:09:11 +09:00
yuin
25bdeb0fee Fixes #456 2024-06-23 21:46:17 +09:00
yuin
a590622b15 Fixes #455 2024-06-14 22:05:13 +09:00
Yusuke Inuzuka
fde4948b4d
Merge pull request #455 from camdencheek/support-single-tilde-strikethrough
GitHub flavored markdown: support single-tilde strikethrough
2024-06-14 21:26:16 +09:00
Camden Cheek
9c09ae0019
support single-tilde strikethrough 2024-06-11 11:16:22 -06:00
Yusuke Inuzuka
c15e394c27
Merge pull request #448 from movsb/fix-attribute-string
make RenderAttributes() accept both []byte and string
2024-04-03 18:46:06 +09:00
movsb
e405d57be0 make SetAttributeString() accept both []byte and string 2024-04-02 20:00:37 +08:00
Yusuke Inuzuka
ce6424aa0e
Merge pull request #446 from mr-chelyshkin/goldmark-tgmd
add link to goldmark-tgmd renderer
2024-03-24 21:08:58 +09:00
mr-chelyshkin
09afa2feba add link to goldmark-tgmd renderer 2024-03-23 08:12:08 +02:00
Yusuke Inuzuka
4f3074451e
Merge pull request #443 from philipparndt/patch-1
docu: update example as it will not build
2024-02-29 14:35:50 +09:00
Philipp Arndt
4675c66d3d
docu: update example as it will not build 2024-02-19 07:56:40 +01:00
yuin
848dc66530 Add playground link 2024-02-03 20:12:14 +09:00
yuin
b8d6d3a9b7 Bump up CommonMark Spec to 0.31.2 2024-02-02 21:13:09 +09:00
yuin
90c46e0829 Remove io/ioutil s 2024-01-23 22:45:14 +09:00
yuin
4bade05173 Drop Go1.18 support 2024-01-23 22:41:12 +09:00
David Chase
2b845f2615 update unsafe code to user newer-faster-better idioms 2023-11-28 11:16:10 -05:00
Yusuke Inuzuka
e3d8ed9725
Merge pull request #429 from movsb/extension-wiki-table
Add extension: wiki-table
2023-11-21 19:16:40 +09:00
movsb
697cd509b1 Add extension: wiki-table 2023-11-21 08:34:34 +08:00
Yusuke Inuzuka
ff3285aa2a
Merge pull request #427 from lyricat/patch-1
README/extensions: Add goldmark-enclave
2023-11-20 18:09:01 +09:00
Lyric Wai
c2167685c1
Add an extension, goldmark-enclave 2023-11-17 19:47:10 +09:00
yuin
39a50c623e Add goldmark-dynamic 2023-11-03 21:27:34 +09:00
yuin
9c9003363f Simplify EastAsianLineBreaks 2023-10-28 17:57:55 +09:00
Yusuke Inuzuka
a89ad04c49
Merge pull request #411 from henry0312/update_cond_east_asian_line_breaks
Define line break styles for east asian characters as options
2023-10-28 17:27:21 +09:00
OMOTO Tsukasa
6b3067e7e7 Implements CSS3Draft 2023-10-24 21:54:35 +09:00
yuin
6442ae1259 Fix #416 2023-10-14 18:02:09 +09:00
Yusuke Inuzuka
68e53654f2
Merge pull request #419 from roife/master
Fix #418
2023-10-08 21:43:29 +09:00
roife
04d4dd50ab Fix #418 2023-09-29 02:41:10 +08:00
OMOTO Tsukasa
8c6830d73b fix errors of lints 2023-09-24 15:07:17 +09:00
OMOTO Tsukasa
792af6819e Updat README.md 2023-09-24 14:25:34 +09:00
OMOTO Tsukasa
9d0b1b6bb8 Define EastAsianLineBreaksStyle to specify behavior of line breaking 2023-09-24 14:25:28 +09:00
OMOTO Tsukasa
dc2230c235 fix tests 2023-09-10 18:48:44 +09:00
OMOTO Tsukasa
2367b9ff46 add comments 2023-09-10 15:17:16 +09:00
OMOTO Tsukasa
6cbcfebb71 Add a WorksEvenWithOneSide option to EastAsianLineBreak 2023-09-10 15:08:57 +09:00
OMOTO Tsukasa
6ef9b10a3a Improve line breaking behavior for east asian characters
This commit aims to produce more natural line breaks in the rendered output.
2023-08-27 15:13:49 +09:00
yuin
d39ab8f93e Runs linters only if on the linux 2023-08-15 18:53:16 +09:00
yuin
9b02182dd0 Apply linters 2023-08-15 18:40:41 +09:00
Yusuke Inuzuka
ac56543632
Merge pull request #409 from henry0312/support_cjk_symbols_and_punctuation
Add support for CJK Symbols and Punctuation
2023-08-13 22:26:44 +09:00
OMOTO Tsukasa
2f1b40d881 Support CJK Symbols and Punctuation
This commit adds support of CJK Symbols and Punctuation to `func IsEastAsianWideRune`
2023-08-13 13:45:19 +09:00
61 changed files with 7086 additions and 5582 deletions

@ -5,22 +5,29 @@ jobs:
strategy: strategy:
fail-fast: false fail-fast: false
matrix: matrix:
go-version: [1.18.x, 1.19.x] go-version: [1.21.x, 1.22.x]
platform: [ubuntu-latest, macos-latest, windows-latest] platform: [ubuntu-latest, macos-latest, windows-latest]
runs-on: ${{ matrix.platform }} runs-on: ${{ matrix.platform }}
steps: steps:
- name: Install Go - name: Install Go
uses: actions/setup-go@v1 uses: actions/setup-go@v4
with: with:
go-version: ${{ matrix.go-version }} go-version: ${{ matrix.go-version }}
- name: Checkout code - name: Checkout code
uses: actions/checkout@v1 uses: actions/checkout@v3
- name: Run lints
uses: golangci/golangci-lint-action@v6
with:
version: latest
if: "matrix.platform == 'ubuntu-latest'" # gofmt linter fails on Windows for CRLF problems
- name: Run tests - name: Run tests
env:
GOLDMARK_TEST_TIMEOUT_MULTIPLIER: 5
run: go test -v ./... -covermode=count -coverprofile=coverage.out -coverpkg=./... run: go test -v ./... -covermode=count -coverprofile=coverage.out -coverpkg=./...
- name: Install goveralls
run: go install github.com/mattn/goveralls@latest
- name: Send coverage - name: Send coverage
if: "matrix.platform == 'ubuntu-latest'" if: "matrix.platform == 'ubuntu-latest'"
env: env:
COVERALLS_TOKEN: ${{ secrets.GITHUB_TOKEN }} COVERALLS_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: | run: goveralls -coverprofile=coverage.out -service=github
GO111MODULE=off go get github.com/mattn/goveralls
$(go env GOPATH)/bin/goveralls -coverprofile=coverage.out -service=github

.golangci.yml (new file, 105 lines added)

@ -0,0 +1,105 @@
run:
deadline: 10m
issues:
exclude-use-default: false
exclude-rules:
- path: _test.go
linters:
- errcheck
- lll
exclude:
- "Package util"
linters:
disable-all: true
enable:
- errcheck
- gosimple
- govet
- ineffassign
- staticcheck
- typecheck
- unused
- gofmt
- godot
- makezero
- misspell
- revive
- wastedassign
- lll
linters-settings:
revive:
severity: "warning"
confidence: 0.8
rules:
- name: blank-imports
severity: warning
disabled: false
- name: context-as-argument
severity: warning
disabled: false
- name: context-keys-type
severity: warning
disabled: false
- name: dot-imports
severity: warning
disabled: true
- name: error-return
severity: warning
disabled: false
- name: error-strings
severity: warning
disabled: false
- name: error-naming
severity: warning
disabled: false
- name: exported
severity: warning
disabled: false
- name: increment-decrement
severity: warning
disabled: false
- name: var-naming
severity: warning
disabled: false
- name: var-declaration
severity: warning
disabled: false
- name: package-comments
severity: warning
disabled: false
- name: range
severity: warning
disabled: false
- name: receiver-naming
severity: warning
disabled: false
- name: time-naming
severity: warning
disabled: false
- name: unexported-return
severity: warning
disabled: false
- name: indent-error-flow
severity: warning
disabled: false
- name: errorf
severity: warning
disabled: false
- name: empty-block
severity: warning
disabled: true
- name: superfluous-else
severity: warning
disabled: false
- name: unused-parameter
severity: warning
disabled: true
- name: unreachable-code
severity: warning
disabled: false
- name: redefines-builtin-id
severity: warning
disabled: false

@ -1,4 +1,7 @@
.PHONY: test fuzz .PHONY: test fuzz lint
lint:
golangci-lint run -c .golangci.yml ./...
test: test:
go test -coverprofile=profile.out -coverpkg=github.com/yuin/goldmark,github.com/yuin/goldmark/ast,github.com/yuin/goldmark/extension,github.com/yuin/goldmark/extension/ast,github.com/yuin/goldmark/parser,github.com/yuin/goldmark/renderer,github.com/yuin/goldmark/renderer/html,github.com/yuin/goldmark/text,github.com/yuin/goldmark/util ./... go test -coverprofile=profile.out -coverpkg=github.com/yuin/goldmark,github.com/yuin/goldmark/ast,github.com/yuin/goldmark/extension,github.com/yuin/goldmark/extension/ast,github.com/yuin/goldmark/parser,github.com/yuin/goldmark/renderer,github.com/yuin/goldmark/renderer/html,github.com/yuin/goldmark/text,github.com/yuin/goldmark/util ./...

@ -2,13 +2,15 @@ goldmark
========================================== ==========================================
[![https://pkg.go.dev/github.com/yuin/goldmark](https://pkg.go.dev/badge/github.com/yuin/goldmark.svg)](https://pkg.go.dev/github.com/yuin/goldmark) [![https://pkg.go.dev/github.com/yuin/goldmark](https://pkg.go.dev/badge/github.com/yuin/goldmark.svg)](https://pkg.go.dev/github.com/yuin/goldmark)
[![https://github.com/yuin/goldmark/actions?query=workflow:test](https://github.com/yuin/goldmark/workflows/test/badge.svg?branch=master&event=push)](https://github.com/yuin/goldmark/actions?query=workflow:test) [![https://github.com/yuin/goldmark/actions?query=workflow:test](https://github.com/yuin/goldmark/actions/workflows/test.yaml/badge.svg?branch=master&event=push)](https://github.com/yuin/goldmark/actions?query=workflow:test)
[![https://coveralls.io/github/yuin/goldmark](https://coveralls.io/repos/github/yuin/goldmark/badge.svg?branch=master)](https://coveralls.io/github/yuin/goldmark) [![https://coveralls.io/github/yuin/goldmark](https://coveralls.io/repos/github/yuin/goldmark/badge.svg?branch=master)](https://coveralls.io/github/yuin/goldmark)
[![https://goreportcard.com/report/github.com/yuin/goldmark](https://goreportcard.com/badge/github.com/yuin/goldmark)](https://goreportcard.com/report/github.com/yuin/goldmark) [![https://goreportcard.com/report/github.com/yuin/goldmark](https://goreportcard.com/badge/github.com/yuin/goldmark)](https://goreportcard.com/report/github.com/yuin/goldmark)
> A Markdown parser written in Go. Easy to extend, standards-compliant, well-structured. > A Markdown parser written in Go. Easy to extend, standards-compliant, well-structured.
goldmark is compliant with CommonMark 0.30. goldmark is compliant with CommonMark 0.31.2.
- [goldmark playground](https://yuin.github.io/goldmark/playground/) : Try goldmark online. This playground is built with WASM(5-10MB).
Motivation Motivation
---------------------- ----------------------
@ -260,7 +262,7 @@ You can override autolinking patterns via options.
| Functional option | Type | Description | | Functional option | Type | Description |
| ----------------- | ---- | ----------- | | ----------------- | ---- | ----------- |
| `extension.WithLinkifyAllowedProtocols` | `[][]byte` | List of allowed protocols such as `[][]byte{ []byte("http:") }` | | `extension.WithLinkifyAllowedProtocols` | `[][]byte \| []string` | List of allowed protocols such as `[]string{ "http:" }` |
| `extension.WithLinkifyURLRegexp` | `*regexp.Regexp` | Regexp that defines URLs, including protocols | | `extension.WithLinkifyURLRegexp` | `*regexp.Regexp` | Regexp that defines URLs, including protocols |
| `extension.WithLinkifyWWWRegexp` | `*regexp.Regexp` | Regexp that defines URL starting with `www.`. This pattern corresponds to [the extended www autolink](https://github.github.com/gfm/#extended-www-autolink) | | `extension.WithLinkifyWWWRegexp` | `*regexp.Regexp` | Regexp that defines URL starting with `www.`. This pattern corresponds to [the extended www autolink](https://github.github.com/gfm/#extended-www-autolink) |
| `extension.WithLinkifyEmailRegexp` | `*regexp.Regexp` | Regexp that defines email addresses | | `extension.WithLinkifyEmailRegexp` | `*regexp.Regexp` | Regexp that defines email addresses |
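
The widened option type in the table above (`[][]byte` or `[]string`) can be expressed in Go with a type-set constraint. The following is a sketch of that idea only: `allowedProtocols` is a hypothetical helper, not the signature of `extension.WithLinkifyAllowedProtocols`.

```go
package main

import "fmt"

// allowedProtocols is a hypothetical helper that normalizes a list of
// protocols, given either as strings or as byte slices, into [][]byte.
// The conversion []byte(v) is valid because every type in the constraint
// converts to []byte. For string inputs it copies; for []byte inputs it
// aliases the caller's slice.
func allowedProtocols[T []byte | string](values []T) [][]byte {
	out := make([][]byte, 0, len(values))
	for _, v := range values {
		out = append(out, []byte(v))
	}
	return out
}

func main() {
	fromStrings := allowedProtocols([]string{"http:", "https:"})
	fromBytes := allowedProtocols([][]byte{[]byte("http:")})
	fmt.Println(len(fromStrings), len(fromBytes))
}
```

A single generic function keeps existing `[][]byte` callers compiling while letting new callers pass plain string literals.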
@ -277,12 +279,12 @@ markdown := goldmark.New(
), ),
goldmark.WithExtensions( goldmark.WithExtensions(
extension.NewLinkify( extension.NewLinkify(
extension.WithLinkifyAllowedProtocols([][]byte{ extension.WithLinkifyAllowedProtocols([]string{
[]byte("http:"), "http:",
[]byte("https:"), "https:",
}), }),
extension.WithLinkifyURLRegexp( extension.WithLinkifyURLRegexp(
xurls.Strict, xurls.Strict(),
), ),
), ),
), ),
@ -297,13 +299,13 @@ This extension has some options:
| Functional option | Type | Description | | Functional option | Type | Description |
| ----------------- | ---- | ----------- | | ----------------- | ---- | ----------- |
| `extension.WithFootnoteIDPrefix` | `[]byte` | a prefix for the id attributes.| | `extension.WithFootnoteIDPrefix` | `[]byte \| string` | a prefix for the id attributes.|
| `extension.WithFootnoteIDPrefixFunction` | `func(gast.Node) []byte` | a function that determines the id attribute for given Node.| | `extension.WithFootnoteIDPrefixFunction` | `func(gast.Node) []byte` | a function that determines the id attribute for given Node.|
| `extension.WithFootnoteLinkTitle` | `[]byte` | an optional title attribute for footnote links.| | `extension.WithFootnoteLinkTitle` | `[]byte \| string` | an optional title attribute for footnote links.|
| `extension.WithFootnoteBacklinkTitle` | `[]byte` | an optional title attribute for footnote backlinks. | | `extension.WithFootnoteBacklinkTitle` | `[]byte \| string` | an optional title attribute for footnote backlinks. |
| `extension.WithFootnoteLinkClass` | `[]byte` | a class for footnote links. This defaults to `footnote-ref`. | | `extension.WithFootnoteLinkClass` | `[]byte \| string` | a class for footnote links. This defaults to `footnote-ref`. |
| `extension.WithFootnoteBacklinkClass` | `[]byte` | a class for footnote backlinks. This defaults to `footnote-backref`. | | `extension.WithFootnoteBacklinkClass` | `[]byte \| string` | a class for footnote backlinks. This defaults to `footnote-backref`. |
| `extension.WithFootnoteBacklinkHTML` | `[]byte` | HTML content for footnote backlinks. This defaults to `↩︎`. | | `extension.WithFootnoteBacklinkHTML` | `[]byte \| string` | HTML content for footnote backlinks. This defaults to `↩︎`. |
Some options can have special substitutions. Occurrences of “^^” in the string will be replaced by the corresponding footnote number in the HTML output. Occurrences of “%%” will be replaced by a number for the reference (footnotes can have multiple references). Some options can have special substitutions. Occurrences of “^^” in the string will be replaced by the corresponding footnote number in the HTML output. Occurrences of “%%” will be replaced by a number for the reference (footnotes can have multiple references).
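
The substitution rule described above can be sketched with plain string replacement. This is an illustration under stated assumptions, not the extension's implementation: `substitute` and its parameters are hypothetical names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// substitute is a hypothetical helper: it replaces every "^^" with the
// footnote number and every "%%" with the reference number.
func substitute(s string, footnote, ref int) string {
	s = strings.ReplaceAll(s, "^^", strconv.Itoa(footnote))
	return strings.ReplaceAll(s, "%%", strconv.Itoa(ref))
}

func main() {
	fmt.Println(substitute("Back to reference %% of footnote ^^", 3, 2))
}
```

With these inputs, a title template shared by all backlinks yields a distinct string for each footnote and each of its references.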
@ -319,7 +321,7 @@ for _, path := range files {
markdown := goldmark.New( markdown := goldmark.New(
goldmark.WithExtensions( goldmark.WithExtensions(
NewFootnote( NewFootnote(
WithFootnoteIDPrefix([]byte(path)), WithFootnoteIDPrefix(path),
), ),
), ),
) )
@ -379,9 +381,47 @@ This extension provides additional options for CJK users.
| Functional option | Type | Description | | Functional option | Type | Description |
| ----------------- | ---- | ----------- | | ----------------- | ---- | ----------- |
| `extension.WithEastAsianLineBreaks` | `-` | Soft line breaks are rendered as a newline. Some asian users will see it as an unnecessary space. With this option, soft line breaks between east asian wide characters will be ignored. | | `extension.WithEastAsianLineBreaks` | `...extension.EastAsianLineBreaksStyle` | Soft line breaks are rendered as a newline. Some asian users will see it as an unnecessary space. With this option, soft line breaks between east asian wide characters will be ignored. This defaults to `EastAsianLineBreaksStyleSimple`. |
| `extension.WithEscapedSpace` | `-` | Without spaces around an emphasis started with east asian punctuations, it is not interpreted as an emphasis(as defined in CommonMark spec). With this option, you can avoid this inconvenient behavior by putting 'not rendered' spaces around an emphasis like `太郎は\ **「こんにちわ」**\ といった`. | | `extension.WithEscapedSpace` | `-` | Without spaces around an emphasis started with east asian punctuations, it is not interpreted as an emphasis(as defined in CommonMark spec). With this option, you can avoid this inconvenient behavior by putting 'not rendered' spaces around an emphasis like `太郎は\ **「こんにちわ」**\ といった`. |
#### Styles of Line Breaking
| Style | Description |
| ----- | ----------- |
| `EastAsianLineBreaksStyleSimple` | Soft line breaks are ignored if both sides of the break are east asian wide character. This behavior is the same as [`east_asian_line_breaks`](https://pandoc.org/MANUAL.html#extension-east_asian_line_breaks) in Pandoc. |
| `EastAsianLineBreaksCSS3Draft` | This option implements CSS text level3 [Segment Break Transformation Rules](https://drafts.csswg.org/css-text-3/#line-break-transform) with [some enhancements](https://github.com/w3c/csswg-drafts/issues/5086). |
#### Example of `EastAsianLineBreaksStyleSimple`
Input Markdown:
```md
私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。
```
Output:
```html
<p>私はプログラマーです。東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。</p>
```
#### Example of `EastAsianLineBreaksCSS3Draft`
Input Markdown:
```md
私はプログラマーです。
東京の会社に勤めています。
GoでWebアプリケーションを開発しています。
```
Output:
```html
<p>私はプログラマーです。東京の会社に勤めています。GoでWebアプリケーションを開発しています。</p>
```
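
The `EastAsianLineBreaksStyleSimple` rule can be approximated in a few lines. The sketch below is not goldmark's implementation: `isWide` is a deliberately rough assumption (Han, Hiragana, Katakana, and the CJK Symbols and Punctuation block) standing in for a real East Asian width check, and `joinSimple` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
	"unicode/utf8"
)

// isWide is a rough approximation assumed for this sketch: Han, Hiragana,
// Katakana, and the CJK Symbols and Punctuation block (U+3000..U+303F).
func isWide(r rune) bool {
	return unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana) ||
		(r >= 0x3000 && r <= 0x303F)
}

// joinSimple joins lines and drops the soft line break only when the
// characters on both sides of the break are wide.
func joinSimple(lines []string) string {
	var b strings.Builder
	for i, line := range lines {
		if i > 0 {
			prev, _ := utf8.DecodeLastRuneInString(lines[i-1])
			next, _ := utf8.DecodeRuneInString(line)
			if !(isWide(prev) && isWide(next)) {
				b.WriteByte('\n')
			}
		}
		b.WriteString(line)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", joinSimple([]string{
		"私はプログラマーです。",
		"東京の会社に勤めています。",
		"GoでWebアプリケーションを開発しています。",
	}))
}
```

The break before `Go` survives because `G` is not a wide character, which matches the Simple example; the CSS3Draft style removes that break as well.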
Security Security
-------------------- --------------------
@ -429,6 +469,7 @@ As you can see, goldmark's performance is on par with cmark's.
Extensions Extensions
-------------------- --------------------
### List of extensions
- [goldmark-meta](https://github.com/yuin/goldmark-meta): A YAML metadata - [goldmark-meta](https://github.com/yuin/goldmark-meta): A YAML metadata
extension for the goldmark Markdown parser. extension for the goldmark Markdown parser.
@ -452,6 +493,14 @@ Extensions
- [goldmark-d2](https://github.com/FurqanSoftware/goldmark-d2): Adds support for [D2](https://d2lang.com/) diagrams. - [goldmark-d2](https://github.com/FurqanSoftware/goldmark-d2): Adds support for [D2](https://d2lang.com/) diagrams.
- [goldmark-katex](https://github.com/FurqanSoftware/goldmark-katex): Adds support for [KaTeX](https://katex.org/) math and equations. - [goldmark-katex](https://github.com/FurqanSoftware/goldmark-katex): Adds support for [KaTeX](https://katex.org/) math and equations.
- [goldmark-img64](https://github.com/tenkoh/goldmark-img64): Adds support for embedding images into the document as DataURL (base64 encoded). - [goldmark-img64](https://github.com/tenkoh/goldmark-img64): Adds support for embedding images into the document as DataURL (base64 encoded).
- [goldmark-enclave](https://github.com/quailyquaily/goldmark-enclave): Adds support for embedding youtube/bilibili video, X's [oembed X](https://publish.x.com/), [tradingview chart](https://www.tradingview.com/widget/)'s chart, [quaily widget](https://quaily.com), [spotify embeds](https://developer.spotify.com/documentation/embeds), [dify embed](https://dify.ai/) and html audio into the document.
- [goldmark-wiki-table](https://github.com/movsb/goldmark-wiki-table): Adds support for embedding Wiki Tables.
- [goldmark-tgmd](https://github.com/Mad-Pixels/goldmark-tgmd): A Telegram markdown renderer that can be passed to `goldmark.WithRenderer()`.
### Loading extensions at runtime
[goldmark-dynamic](https://github.com/yuin/goldmark-dynamic) allows you to write a goldmark extension in Lua and load it at runtime without re-compilation.
Please refer to [goldmark-dynamic](https://github.com/yuin/goldmark-dynamic) for details.
goldmark internal(for extension developers) goldmark internal(for extension developers)

@ -6,7 +6,7 @@ require (
github.com/88250/lute v1.7.5 github.com/88250/lute v1.7.5
github.com/gomarkdown/markdown v0.0.0-20230322041520-c84983bdbf2a github.com/gomarkdown/markdown v0.0.0-20230322041520-c84983bdbf2a
github.com/russross/blackfriday/v2 v2.1.0 github.com/russross/blackfriday/v2 v2.1.0
github.com/yuin/goldmark v1.2.1 github.com/yuin/goldmark v0.0.0
gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a gitlab.com/golang-commonmark/markdown v0.0.0-20211110145824-bf3e522c626a
) )
@ -22,3 +22,4 @@ require (
) )
replace gopkg.in/russross/blackfriday.v2 v2.0.1 => github.com/russross/blackfriday/v2 v2.0.1 replace gopkg.in/russross/blackfriday.v2 v2.0.1 => github.com/russross/blackfriday/v2 v2.0.1
replace github.com/yuin/goldmark v0.0.0 => ../../

@ -385,7 +385,8 @@ a* b c d *e*
//- - - - - - - - -// //- - - - - - - - -//
x x
//- - - - - - - - -// //- - - - - - - - -//
<pre><code> x</code></pre> <pre><code> x
</code></pre>
//= = = = = = = = = = = = = = = = = = = = = = = =// //= = = = = = = = = = = = = = = = = = = = = = = =//
26: NUL bytes must be replaced with U+FFFD 26: NUL bytes must be replaced with U+FFFD
@ -771,3 +772,74 @@ a <!-- b -->
<p>&lt;img src=./.assets/logo.svg</p> <p>&lt;img src=./.assets/logo.svg</p>
<p>/&gt;</p> <p>/&gt;</p>
//= = = = = = = = = = = = = = = = = = = = = = = =// //= = = = = = = = = = = = = = = = = = = = = = = =//
61: Image alt with a new line
//- - - - - - - - -//
![alt
text](logo.png)
//- - - - - - - - -//
<p><img src="logo.png" alt="alt
text" /></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
62: Image alt with an escaped character
//- - - - - - - - -//
![\`alt](https://example.com/img.png)
//- - - - - - - - -//
<p><img src="https://example.com/img.png" alt="`alt" /></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
63: Emphasis in link label
//- - - - - - - - -//
[*[a]*](b)
//- - - - - - - - -//
<p><a href="b"><em>[a]</em></a></p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
64: Nested list under an empty list item
//- - - - - - - - -//
-
- foo
//- - - - - - - - -//
<ul>
<li>
<ul>
<li>foo</li>
</ul>
</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
65: Nested fenced code block with tab
//- - - - - - - - -//
> ```
> 0
> ```
//- - - - - - - - -//
<blockquote>
<pre><code> 0
</code></pre>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//
66: EOF should be rendered as a newline with an unclosed block(w/ TAB)
//- - - - - - - - -//
> ```
> 0
//- - - - - - - - -//
<blockquote>
<pre><code> 0
</code></pre>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//
67: EOF should be rendered as a newline with an unclosed block
//- - - - - - - - -//
> ```
> 0
//- - - - - - - - -//
<blockquote>
<pre><code> 0
</code></pre>
</blockquote>
//= = = = = = = = = = = = = = = = = = = = = = = =//

File diff suppressed because it is too large

@ -39,7 +39,7 @@ func NewNodeKind(name string) NodeKind {
return kindMax return kindMax
} }
// An Attribute is an attribute of the Node // An Attribute is an attribute of the Node.
type Attribute struct { type Attribute struct {
Name []byte Name []byte
Value interface{} Value interface{}
@ -123,6 +123,12 @@ type Node interface {
Dump(source []byte, level int) Dump(source []byte, level int)
// Text returns text values of this node. // Text returns text values of this node.
// This method is valid only for some inline nodes.
// If this node is a block node, Text returns a text value as reasonable as possible.
// Notice that there are no 'correct' text values for the block nodes.
// Result for the block nodes may be different from your expectation.
//
// Deprecated: Use other properties of the node to get the text value (e.g. Paragraph.Lines, Text.Value).
Text(source []byte) []byte Text(source []byte) []byte
// HasBlankPreviousLines returns true if the row before this node is blank, // HasBlankPreviousLines returns true if the row before this node is blank,
@ -248,7 +254,7 @@ func (n *BaseNode) RemoveChildren(self Node) {
n.childCount = 0 n.childCount = 0
} }
// SortChildren implements Node.SortChildren // SortChildren implements Node.SortChildren.
func (n *BaseNode) SortChildren(comparator func(n1, n2 Node) int) { func (n *BaseNode) SortChildren(comparator func(n1, n2 Node) int) {
var sorted Node var sorted Node
current := n.firstChild current := n.firstChild
@ -358,7 +364,7 @@ func (n *BaseNode) InsertBefore(self, v1, insertee Node) {
} }
} }
// OwnerDocument implements Node.OwnerDocument // OwnerDocument implements Node.OwnerDocument.
func (n *BaseNode) OwnerDocument() *Document { func (n *BaseNode) OwnerDocument() *Document {
d := n.Parent() d := n.Parent()
for { for {
@ -375,10 +381,17 @@ func (n *BaseNode) OwnerDocument() *Document {
} }
// Text implements Node.Text . // Text implements Node.Text .
//
// Deprecated: Use other properties of the node to get the text value (e.g. Paragraph.Lines, Text.Value).
func (n *BaseNode) Text(source []byte) []byte { func (n *BaseNode) Text(source []byte) []byte {
var buf bytes.Buffer var buf bytes.Buffer
for c := n.firstChild; c != nil; c = c.NextSibling() { for c := n.firstChild; c != nil; c = c.NextSibling() {
buf.Write(c.Text(source)) buf.Write(c.Text(source))
if sb, ok := c.(interface {
SoftLineBreak() bool
}); ok && sb.SoftLineBreak() {
buf.WriteByte('\n')
}
} }
return buf.Bytes() return buf.Bytes()
} }
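
The added lines in this hunk rely on a type assertion against an anonymous interface, so that only children that happen to expose `SoftLineBreak() bool` contribute a newline. A standalone sketch of that pattern follows; the `node`, `word`, and `brokenWord` types are hypothetical and exist only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
)

// node is a hypothetical minimal interface every child satisfies.
type node interface{ text() string }

// word is a plain child without any line-break information.
type word struct{ s string }

func (w word) text() string { return w.s }

// brokenWord is a child that is followed by a soft line break.
type brokenWord struct{ word }

func (brokenWord) SoftLineBreak() bool { return true }

// joinText concatenates child text and appends '\n' after any child that
// optionally implements SoftLineBreak and reports true.
func joinText(children []node) string {
	var buf bytes.Buffer
	for _, c := range children {
		buf.WriteString(c.text())
		if sb, ok := c.(interface{ SoftLineBreak() bool }); ok && sb.SoftLineBreak() {
			buf.WriteByte('\n')
		}
	}
	return buf.String()
}

func main() {
	fmt.Printf("%q\n", joinText([]node{brokenWord{word{"a"}}, word{"b"}}))
}
```

Checking for the method at the call site avoids adding `SoftLineBreak` to the shared interface, so node types that have no notion of a line break stay untouched.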
@ -399,7 +412,7 @@ func (n *BaseNode) SetAttribute(name []byte, value interface{}) {
n.attributes = append(n.attributes, Attribute{name, value}) n.attributes = append(n.attributes, Attribute{name, value})
} }
// SetAttributeString implements Node.SetAttributeString // SetAttributeString implements Node.SetAttributeString.
func (n *BaseNode) SetAttributeString(name string, value interface{}) { func (n *BaseNode) SetAttributeString(name string, value interface{}) {
n.SetAttribute(util.StringToReadOnlyBytes(name), value) n.SetAttribute(util.StringToReadOnlyBytes(name), value)
} }
@ -422,12 +435,12 @@ func (n *BaseNode) AttributeString(s string) (interface{}, bool) {
return n.Attribute(util.StringToReadOnlyBytes(s)) return n.Attribute(util.StringToReadOnlyBytes(s))
} }
// Attributes implements Node.Attributes // Attributes implements Node.Attributes.
func (n *BaseNode) Attributes() []Attribute { func (n *BaseNode) Attributes() []Attribute {
return n.attributes return n.attributes
} }
// RemoveAttributes implements Node.RemoveAttributes // RemoveAttributes implements Node.RemoveAttributes.
func (n *BaseNode) RemoveAttributes() { func (n *BaseNode) RemoveAttributes() {
n.attributes = nil n.attributes = nil
} }

@ -5,21 +5,6 @@ import (
"testing" "testing"
) )
func TestRemoveChildren(t *testing.T) {
root := NewDocument()
node1 := NewDocument()
node2 := NewDocument()
root.AppendChild(root, node1)
root.AppendChild(root, node2)
root.RemoveChildren(root)
t.Logf("%+v", node2.PreviousSibling())
}
func TestWalk(t *testing.T) { func TestWalk(t *testing.T) {
tests := []struct { tests := []struct {
name string name string

@ -14,12 +14,12 @@ type BaseBlock struct {
lines *textm.Segments lines *textm.Segments
} }
// Type implements Node.Type // Type implements Node.Type.
func (b *BaseBlock) Type() NodeType { func (b *BaseBlock) Type() NodeType {
return TypeBlock return TypeBlock
} }
// IsRaw implements Node.IsRaw // IsRaw implements Node.IsRaw.
func (b *BaseBlock) IsRaw() bool { func (b *BaseBlock) IsRaw() bool {
return false return false
} }
@ -34,7 +34,7 @@ func (b *BaseBlock) SetBlankPreviousLines(v bool) {
b.blankPreviousLines = v b.blankPreviousLines = v
} }
// Lines implements Node.Lines // Lines implements Node.Lines.
func (b *BaseBlock) Lines() *textm.Segments { func (b *BaseBlock) Lines() *textm.Segments {
if b.lines == nil { if b.lines == nil {
b.lines = textm.NewSegments() b.lines = textm.NewSegments()
@ -42,7 +42,7 @@ func (b *BaseBlock) Lines() *textm.Segments {
return b.lines return b.lines
} }
// SetLines implements Node.SetLines // SetLines implements Node.SetLines.
func (b *BaseBlock) SetLines(v *textm.Segments) { func (b *BaseBlock) SetLines(v *textm.Segments) {
b.lines = v b.lines = v
} }
@ -72,7 +72,7 @@ func (n *Document) Kind() NodeKind {
return KindDocument return KindDocument
} }
// OwnerDocument implements Node.OwnerDocument // OwnerDocument implements Node.OwnerDocument.
func (n *Document) OwnerDocument() *Document { func (n *Document) OwnerDocument() *Document {
return n return n
} }
@ -130,6 +130,13 @@ func (n *TextBlock) Kind() NodeKind {
return KindTextBlock return KindTextBlock
} }
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. TextBlock.Lines).
func (n *TextBlock) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewTextBlock returns a new TextBlock node. // NewTextBlock returns a new TextBlock node.
func NewTextBlock() *TextBlock { func NewTextBlock() *TextBlock {
return &TextBlock{ return &TextBlock{
@ -155,6 +162,13 @@ func (n *Paragraph) Kind() NodeKind {
return KindParagraph return KindParagraph
} }
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. Paragraph.Lines).
func (n *Paragraph) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewParagraph returns a new Paragraph node. // NewParagraph returns a new Paragraph node.
func NewParagraph() *Paragraph { func NewParagraph() *Paragraph {
return &Paragraph{ return &Paragraph{
@ -249,6 +263,13 @@ func (n *CodeBlock) Kind() NodeKind {
return KindCodeBlock return KindCodeBlock
} }
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. CodeBlock.Lines).
func (n *CodeBlock) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewCodeBlock returns a new CodeBlock node. // NewCodeBlock returns a new CodeBlock node.
func NewCodeBlock() *CodeBlock { func NewCodeBlock() *CodeBlock {
return &CodeBlock{ return &CodeBlock{
@ -304,6 +325,13 @@ func (n *FencedCodeBlock) Kind() NodeKind {
return KindFencedCodeBlock return KindFencedCodeBlock
} }
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. FencedCodeBlock.Lines).
func (n *FencedCodeBlock) Text(source []byte) []byte {
return n.Lines().Value(source)
}
// NewFencedCodeBlock returns a new FencedCodeBlock node. // NewFencedCodeBlock returns a new FencedCodeBlock node.
func NewFencedCodeBlock(info *Text) *FencedCodeBlock { func NewFencedCodeBlock(info *Text) *FencedCodeBlock {
return &FencedCodeBlock{ return &FencedCodeBlock{
@ -431,19 +459,19 @@ func NewListItem(offset int) *ListItem {
type HTMLBlockType int type HTMLBlockType int
const ( const (
// HTMLBlockType1 represents type 1 html blocks // HTMLBlockType1 represents type 1 html blocks.
HTMLBlockType1 HTMLBlockType = iota + 1 HTMLBlockType1 HTMLBlockType = iota + 1
// HTMLBlockType2 represents type 2 html blocks // HTMLBlockType2 represents type 2 html blocks.
HTMLBlockType2 HTMLBlockType2
// HTMLBlockType3 represents type 3 html blocks // HTMLBlockType3 represents type 3 html blocks.
HTMLBlockType3 HTMLBlockType3
// HTMLBlockType4 represents type 4 html blocks // HTMLBlockType4 represents type 4 html blocks.
HTMLBlockType4 HTMLBlockType4
// HTMLBlockType5 represents type 5 html blocks // HTMLBlockType5 represents type 5 html blocks.
HTMLBlockType5 HTMLBlockType5
// HTMLBlockType6 represents type 6 html blocks // HTMLBlockType6 represents type 6 html blocks.
HTMLBlockType6 HTMLBlockType6
// HTMLBlockType7 represents type 7 html blocks // HTMLBlockType7 represents type 7 html blocks.
HTMLBlockType7 HTMLBlockType7
) )
@ -498,6 +526,17 @@ func (n *HTMLBlock) Kind() NodeKind {
return KindHTMLBlock return KindHTMLBlock
} }
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. HTMLBlock.Lines).
func (n *HTMLBlock) Text(source []byte) []byte {
ret := n.Lines().Value(source)
if n.HasClosure() {
ret = append(ret, n.ClosureLine.Value(source)...)
}
return ret
}
// NewHTMLBlock returns a new HTMLBlock node. // NewHTMLBlock returns a new HTMLBlock node.
func NewHTMLBlock(typ HTMLBlockType) *HTMLBlock { func NewHTMLBlock(typ HTMLBlockType) *HTMLBlock {
return &HTMLBlock{ return &HTMLBlock{

@ -13,12 +13,12 @@ type BaseInline struct {
BaseNode BaseNode
} }
// Type implements Node.Type // Type implements Node.Type.
func (b *BaseInline) Type() NodeType { func (b *BaseInline) Type() NodeType {
return TypeInline return TypeInline
} }
// IsRaw implements Node.IsRaw // IsRaw implements Node.IsRaw.
func (b *BaseInline) IsRaw() bool { func (b *BaseInline) IsRaw() bool {
return false return false
} }
@ -33,12 +33,12 @@ func (b *BaseInline) SetBlankPreviousLines(v bool) {
panic("can not call with inline nodes.") panic("can not call with inline nodes.")
} }
// Lines implements Node.Lines // Lines implements Node.Lines.
func (b *BaseInline) Lines() *textm.Segments { func (b *BaseInline) Lines() *textm.Segments {
panic("can not call with inline nodes.") panic("can not call with inline nodes.")
} }
// SetLines implements Node.SetLines // SetLines implements Node.SetLines.
func (b *BaseInline) SetLines(v *textm.Segments) { func (b *BaseInline) SetLines(v *textm.Segments) {
panic("can not call with inline nodes.") panic("can not call with inline nodes.")
} }
@ -132,7 +132,8 @@ func (n *Text) Merge(node Node, source []byte) bool {
if !ok { if !ok {
return false return false
} }
if n.Segment.Stop != t.Segment.Start || t.Segment.Padding != 0 || source[n.Segment.Stop-1] == '\n' || t.IsRaw() != n.IsRaw() { if n.Segment.Stop != t.Segment.Start || t.Segment.Padding != 0 ||
source[n.Segment.Stop-1] == '\n' || t.IsRaw() != n.IsRaw() {
return false return false
} }
n.Segment.Stop = t.Segment.Stop n.Segment.Stop = t.Segment.Stop
@ -142,17 +143,25 @@ func (n *Text) Merge(node Node, source []byte) bool {
} }
// Text implements Node.Text. // Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. Text.Value).
func (n *Text) Text(source []byte) []byte { func (n *Text) Text(source []byte) []byte {
return n.Segment.Value(source) return n.Segment.Value(source)
} }
// Value returns a value of this node.
// SoftLineBreaks are not included in the returned value.
func (n *Text) Value(source []byte) []byte {
return n.Segment.Value(source)
}
// Dump implements Node.Dump. // Dump implements Node.Dump.
func (n *Text) Dump(source []byte, level int) { func (n *Text) Dump(source []byte, level int) {
fs := textFlagsString(n.flags) fs := textFlagsString(n.flags)
if len(fs) != 0 { if len(fs) != 0 {
fs = "(" + fs + ")" fs = "(" + fs + ")"
} }
fmt.Printf("%sText%s: \"%s\"\n", strings.Repeat(" ", level), fs, strings.TrimRight(string(n.Text(source)), "\n")) fmt.Printf("%sText%s: \"%s\"\n", strings.Repeat(" ", level), fs, strings.TrimRight(string(n.Value(source)), "\n"))
} }
// KindText is a NodeKind of the Text node. // KindText is a NodeKind of the Text node.
@ -214,7 +223,7 @@ func MergeOrReplaceTextSegment(parent Node, n Node, s textm.Segment) {
} }
} }
// A String struct is a textual content that has a concrete value // A String struct is a textual content that has a concrete value.
type String struct { type String struct {
BaseInline BaseInline
@ -257,6 +266,8 @@ func (n *String) SetCode(v bool) {
} }
// Text implements Node.Text. // Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. String.Value).
func (n *String) Text(source []byte) []byte { func (n *String) Text(source []byte) []byte {
return n.Value return n.Value
} }
@ -305,7 +316,7 @@ func (n *CodeSpan) IsBlank(source []byte) bool {
return true return true
} }
// Dump implements Node.Dump // Dump implements Node.Dump.
func (n *CodeSpan) Dump(source []byte, level int) { func (n *CodeSpan) Dump(source []byte, level int) {
DumpHelper(n, source, level, nil, nil) DumpHelper(n, source, level, nil, nil)
} }
@ -467,7 +478,7 @@ type AutoLink struct {
// Inline implements Inline.Inline. // Inline implements Inline.Inline.
func (n *AutoLink) Inline() {} func (n *AutoLink) Inline() {}
// Dump implements Node.Dump // Dump implements Node.Dump.
func (n *AutoLink) Dump(source []byte, level int) { func (n *AutoLink) Dump(source []byte, level int) {
segment := n.value.Segment segment := n.value.Segment
m := map[string]string{ m := map[string]string{
@ -491,15 +502,22 @@ func (n *AutoLink) URL(source []byte) []byte {
ret := make([]byte, 0, len(n.Protocol)+s.Len()+3) ret := make([]byte, 0, len(n.Protocol)+s.Len()+3)
ret = append(ret, n.Protocol...) ret = append(ret, n.Protocol...)
ret = append(ret, ':', '/', '/') ret = append(ret, ':', '/', '/')
ret = append(ret, n.value.Text(source)...) ret = append(ret, n.value.Value(source)...)
return ret return ret
} }
return n.value.Text(source) return n.value.Value(source)
} }
// Label returns a label of this node. // Label returns a label of this node.
func (n *AutoLink) Label(source []byte) []byte { func (n *AutoLink) Label(source []byte) []byte {
return n.value.Text(source) return n.value.Value(source)
}
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. AutoLink.Label).
func (n *AutoLink) Text(source []byte) []byte {
return n.value.Value(source)
} }
// NewAutoLink returns a new AutoLink node. // NewAutoLink returns a new AutoLink node.
@ -540,6 +558,13 @@ func (n *RawHTML) Kind() NodeKind {
return KindRawHTML return KindRawHTML
} }
// Text implements Node.Text.
//
// Deprecated: Use other properties of the node to get the text value(i.e. RawHTML.Segments).
func (n *RawHTML) Text(source []byte) []byte {
return n.Segments.Value(source)
}
// NewRawHTML returns a new RawHTML node. // NewRawHTML returns a new RawHTML node.
func NewRawHTML() *RawHTML { func NewRawHTML() *RawHTML {
return &RawHTML{ return &RawHTML{

ast_test.go Normal file

@ -0,0 +1,204 @@
package goldmark_test
import (
"bytes"
"testing"
. "github.com/yuin/goldmark"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/text"
)
func TestASTBlockNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
T2 string
C bool
}{
{
Name: "AtxHeading",
Source: `# l1
a
# l2`,
T1: `l1`,
T2: `l2`,
},
{
Name: "SetextHeading",
Source: `l1
l2
===============
a
l3
l4
==============`,
T1: `l1
l2`,
T2: `l3
l4`,
},
{
Name: "CodeBlock",
Source: ` l1
l2
a
l3
l4`,
T1: `l1
l2
`,
T2: `l3
l4
`,
},
{
Name: "FencedCodeBlock",
Source: "```" + `
l1
l2
` + "```" + `
a
` + "```" + `
l3
l4`,
T1: `l1
l2
`,
T2: `l3
l4
`,
},
{
Name: "Blockquote",
Source: `> l1
> l2
a
> l3
> l4`,
T1: `l1
l2`,
T2: `l3
l4`,
},
{
Name: "List",
Source: `- l1
l2
a
- l3
l4`,
T1: `l1
l2`,
T2: `l3
l4`,
C: true,
},
{
Name: "HTMLBlock",
Source: `<div>
l1
l2
</div>
a
<div>
l3
l4`,
T1: `<div>
l1
l2
</div>
`,
T2: `<div>
l3
l4`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := New()
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild()
c2 := c1.NextSibling().NextSibling()
if cs.C {
c1 = c1.FirstChild()
c2 = c2.FirstChild()
}
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch: %s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
if !bytes.Equal(c2.Text(s), []byte(cs.T2)) { // nolint: staticcheck
t.Errorf("%s(EOF) unmatch: %s", cs.Name, testutil.DiffPretty(c2.Text(s), []byte(cs.T2))) // nolint: staticcheck
}
})
}
}
func TestASTInlineNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
}{
{
Name: "CodeSpan",
Source: "`c1`",
T1: `c1`,
},
{
Name: "Emphasis",
Source: `*c1 **c2***`,
T1: `c1 c2`,
},
{
Name: "Link",
Source: `[label](url)`,
T1: `label`,
},
{
Name: "AutoLink",
Source: `<http://url>`,
T1: `http://url`,
},
{
Name: "RawHTML",
Source: `<span>c1</span>`,
T1: `<span>`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := New()
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild().FirstChild()
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch:\n%s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
})
}
}

@ -2,7 +2,7 @@ package goldmark_test
import ( import (
"encoding/json" "encoding/json"
"io/ioutil" "os"
"testing" "testing"
. "github.com/yuin/goldmark" . "github.com/yuin/goldmark"
@ -20,7 +20,7 @@ type commonmarkSpecTestCase struct {
} }
func TestSpec(t *testing.T) { func TestSpec(t *testing.T) {
bs, err := ioutil.ReadFile("_test/spec.json") bs, err := os.ReadFile("_test/spec.json")
if err != nil { if err != nil {
panic(err) panic(err)
} }

@ -150,7 +150,8 @@ on two lines.</p>
//- - - - - - - - -// //- - - - - - - - -//
<dl> <dl>
<dt>0</dt> <dt>0</dt>
<dd><pre><code> 0</code></pre> <dd><pre><code> 0
</code></pre>
</dd> </dd>
</dl> </dl>
//= = = = = = = = = = = = = = = = = = = = = = = =// //= = = = = = = = = = = = = = = = = = = = = = = =//

@ -5,8 +5,6 @@
<p><del>Hi</del> Hello, world!</p> <p><del>Hi</del> Hello, world!</p>
//= = = = = = = = = = = = = = = = = = = = = = = =// //= = = = = = = = = = = = = = = = = = = = = = = =//
2 2
//- - - - - - - - -// //- - - - - - - - -//
This ~~has a This ~~has a
@ -16,3 +14,26 @@ new paragraph~~.
<p>This ~~has a</p> <p>This ~~has a</p>
<p>new paragraph~~.</p> <p>new paragraph~~.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =// //= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
~Hi~ Hello, world!
//- - - - - - - - -//
<p><del>Hi</del> Hello, world!</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4: Three or more tildes do not create a strikethrough
//- - - - - - - - -//
This will ~~~not~~~ strike.
//- - - - - - - - -//
<p>This will ~~~not~~~ strike.</p>
//= = = = = = = = = = = = = = = = = = = = = = = =//
5: Leading three or more tildes do not create a strikethrough, create a code block
//- - - - - - - - -//
~~~Hi~~~ Hello, world!
//- - - - - - - - -//
<pre><code class="language-Hi~~~"></code></pre>
//= = = = = = = = = = = = = = = = = = = = = = = =//

@ -28,3 +28,24 @@
<li><input disabled="" type="checkbox"> bim</li> <li><input disabled="" type="checkbox"> bim</li>
</ul> </ul>
//= = = = = = = = = = = = = = = = = = = = = = = =// //= = = = = = = = = = = = = = = = = = = = = = = =//
3
//- - - - - - - - -//
- test[x]=[x]
//- - - - - - - - -//
<ul>
<li>test[x]=[x]</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//
4
//- - - - - - - - -//
+ [x] [x]
//- - - - - - - - -//
<ul>
<li><input checked="" disabled="" type="checkbox"> [x]</li>
</ul>
//= = = = = = = = = = = = = = = = = = = = = = = =//

@ -88,7 +88,7 @@ type Footnote struct {
func (n *Footnote) Dump(source []byte, level int) { func (n *Footnote) Dump(source []byte, level int) {
m := map[string]string{} m := map[string]string{}
m["Index"] = fmt.Sprintf("%v", n.Index) m["Index"] = fmt.Sprintf("%v", n.Index)
m["Ref"] = fmt.Sprintf("%s", n.Ref) m["Ref"] = string(n.Ref)
gast.DumpHelper(n, source, level, m, nil) gast.DumpHelper(n, source, level, m, nil)
} }

@ -2,8 +2,9 @@ package ast
import ( import (
"fmt" "fmt"
gast "github.com/yuin/goldmark/ast"
"strings" "strings"
gast "github.com/yuin/goldmark/ast"
) )
// Alignment is a text alignment of table cells. // Alignment is a text alignment of table cells.
@ -45,7 +46,7 @@ type Table struct {
Alignments []Alignment Alignments []Alignment
} }
// Dump implements Node.Dump // Dump implements Node.Dump.
func (n *Table) Dump(source []byte, level int) { func (n *Table) Dump(source []byte, level int) {
gast.DumpHelper(n, source, level, nil, func(level int) { gast.DumpHelper(n, source, level, nil, func(level int) {
indent := strings.Repeat(" ", level) indent := strings.Repeat(" ", level)

extension/ast_test.go Normal file

@ -0,0 +1,123 @@
package extension
import (
"bytes"
"testing"
"github.com/yuin/goldmark"
"github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/testutil"
"github.com/yuin/goldmark/text"
)
func TestASTBlockNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
T2 string
C bool
}{
{
Name: "DefinitionList",
Source: `c1
: c2
c3
a
c4
: c5
c6`,
T1: `c1c2
c3`,
T2: `c4c5
c6`,
},
{
Name: "Table",
Source: `| h1 | h2 |
| -- | -- |
| c1 | c2 |
a
| h3 | h4 |
| -- | -- |
| c3 | c4 |`,
T1: `h1h2c1c2`,
T2: `h3h4c3c4`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
DefinitionList,
Table,
),
)
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild()
c2 := c1.NextSibling().NextSibling()
if cs.C {
c1 = c1.FirstChild()
c2 = c2.FirstChild()
}
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch:\n%s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
if !bytes.Equal(c2.Text(s), []byte(cs.T2)) { // nolint: staticcheck
t.Errorf("%s(EOF) unmatch: %s", cs.Name, testutil.DiffPretty(c2.Text(s), []byte(cs.T2))) // nolint: staticcheck
}
})
}
}
func TestASTInlineNodeText(t *testing.T) {
var cases = []struct {
Name string
Source string
T1 string
}{
{
Name: "Strikethrough",
Source: `~c1 *c2*~`,
T1: `c1 c2`,
},
}
for _, cs := range cases {
t.Run(cs.Name, func(t *testing.T) {
s := []byte(cs.Source)
md := goldmark.New(
goldmark.WithRendererOptions(
html.WithUnsafe(),
),
goldmark.WithExtensions(
Strikethrough,
),
)
n := md.Parser().Parse(text.NewReader(s))
c1 := n.FirstChild().FirstChild()
if !bytes.Equal(c1.Text(s), []byte(cs.T1)) { // nolint: staticcheck
t.Errorf("%s unmatch:\n%s", cs.Name, testutil.DiffPretty(c1.Text(s), []byte(cs.T1))) // nolint: staticcheck
}
})
}
}

@ -9,11 +9,30 @@ import (
// A CJKOption sets options for CJK support mostly for HTML based renderers. // A CJKOption sets options for CJK support mostly for HTML based renderers.
type CJKOption func(*cjk) type CJKOption func(*cjk)
// An EastAsianLineBreaks is a style of east asian line breaks.
type EastAsianLineBreaks int
const (
// EastAsianLineBreaksNone renders line breaks as they are.
EastAsianLineBreaksNone EastAsianLineBreaks = iota
// EastAsianLineBreaksSimple is a style where soft line breaks are ignored
// if both sides of the break are east asian wide characters.
EastAsianLineBreaksSimple
// EastAsianLineBreaksCSS3Draft is a style where soft line breaks are ignored
// even if only one side of the break is an east asian wide character.
EastAsianLineBreaksCSS3Draft
)
// WithEastAsianLineBreaks is a functional option that indicates whether softline breaks // WithEastAsianLineBreaks is a functional option that indicates whether softline breaks
// between east asian wide characters should be ignored. // between east asian wide characters should be ignored.
func WithEastAsianLineBreaks() CJKOption { // style defaults to [EastAsianLineBreaksSimple].
func WithEastAsianLineBreaks(style ...EastAsianLineBreaks) CJKOption {
return func(c *cjk) { return func(c *cjk) {
c.EastAsianLineBreaks = true if len(style) == 0 {
c.EastAsianLineBreaks = EastAsianLineBreaksSimple
return
}
c.EastAsianLineBreaks = style[0]
} }
} }
@ -25,15 +44,18 @@ func WithEscapedSpace() CJKOption {
} }
type cjk struct { type cjk struct {
EastAsianLineBreaks bool EastAsianLineBreaks EastAsianLineBreaks
EscapedSpace bool EscapedSpace bool
} }
// CJK is a goldmark extension that provides functionalities for CJK languages.
var CJK = NewCJK(WithEastAsianLineBreaks(), WithEscapedSpace()) var CJK = NewCJK(WithEastAsianLineBreaks(), WithEscapedSpace())
// NewCJK returns a new extension with given options. // NewCJK returns a new extension with given options.
func NewCJK(opts ...CJKOption) goldmark.Extender { func NewCJK(opts ...CJKOption) goldmark.Extender {
e := &cjk{} e := &cjk{
EastAsianLineBreaks: EastAsianLineBreaksNone,
}
for _, opt := range opts { for _, opt := range opts {
opt(e) opt(e)
} }
@ -41,9 +63,8 @@ func NewCJK(opts ...CJKOption) goldmark.Extender {
} }
func (e *cjk) Extend(m goldmark.Markdown) { func (e *cjk) Extend(m goldmark.Markdown) {
if e.EastAsianLineBreaks { m.Renderer().AddOptions(html.WithEastAsianLineBreaks(
m.Renderer().AddOptions(html.WithEastAsianLineBreaks()) html.EastAsianLineBreaks(e.EastAsianLineBreaks)))
}
if e.EscapedSpace { if e.EscapedSpace {
m.Renderer().AddOptions(html.WithWriter(html.NewWriter(html.WithEscapedSpace()))) m.Renderer().AddOptions(html.WithWriter(html.NewWriter(html.WithEscapedSpace())))
m.Parser().AddOptions(parser.WithEscapedSpace()) m.Parser().AddOptions(parser.WithEscapedSpace())

@ -177,6 +177,7 @@ func TestEastAsianLineBreaks(t *testing.T) {
t, t,
) )
// Tests with EastAsianLineBreaksSimple
markdown = goldmark.New(goldmark.WithRendererOptions( markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(), html.WithXHTML(),
html.WithUnsafe(), html.WithUnsafe(),
@ -197,4 +198,72 @@ func TestEastAsianLineBreaks(t *testing.T) {
}, },
t, t,
) )
no = 8
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between east asian wide characters or punctuations are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と、\r\n言った\r\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と、言ったんです</p>",
},
t,
)
no = 9
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are preserved",
Markdown: "私はプログラマーです。\n東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。",
Expected: "<p>私はプログラマーです。東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。</p>",
},
t,
)
// Tests with EastAsianLineBreaksCSS3Draft
markdown = goldmark.New(goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewCJK(WithEastAsianLineBreaks(EastAsianLineBreaksCSS3Draft)),
),
)
no = 10
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between a western character and an east asian wide character are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言ったa\nんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったaんです</p>",
},
t,
)
no = 11
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are ignored",
Markdown: "太郎は\\ **「こんにちわ」**\\ と言った\nbんです",
Expected: "<p>太郎は\\ <strong>「こんにちわ」</strong>\\ と言ったbんです</p>",
},
t,
)
no = 12
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: no,
Description: "Soft line breaks between an east asian wide character and a western character are ignored",
Markdown: "私はプログラマーです。\n東京の会社に勤めています。\nGoでWebアプリケーションを開発しています。",
Expected: "<p>私はプログラマーです。東京の会社に勤めています。GoでWebアプリケーションを開発しています。</p>",
},
t,
)
} }

@ -113,7 +113,8 @@ func (b *definitionDescriptionParser) Trigger() []byte {
return []byte{':'} return []byte{':'}
} }
func (b *definitionDescriptionParser) Open(parent gast.Node, reader text.Reader, pc parser.Context) (gast.Node, parser.State) { func (b *definitionDescriptionParser) Open(
parent gast.Node, reader text.Reader, pc parser.Context) (gast.Node, parser.State) {
line, _ := reader.PeekLine() line, _ := reader.PeekLine()
pos := pc.BlockOffset() pos := pc.BlockOffset()
indent := pc.BlockIndent() indent := pc.BlockIndent()
@ -199,7 +200,8 @@ func (r *DefinitionListHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFunc
// DefinitionListAttributeFilter defines attribute names which dl elements can have. // DefinitionListAttributeFilter defines attribute names which dl elements can have.
var DefinitionListAttributeFilter = html.GlobalAttributeFilter var DefinitionListAttributeFilter = html.GlobalAttributeFilter
func (r *DefinitionListHTMLRenderer) renderDefinitionList(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) { func (r *DefinitionListHTMLRenderer) renderDefinitionList(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
if n.Attributes() != nil { if n.Attributes() != nil {
_, _ = w.WriteString("<dl") _, _ = w.WriteString("<dl")
@ -217,7 +219,8 @@ func (r *DefinitionListHTMLRenderer) renderDefinitionList(w util.BufWriter, sour
// DefinitionTermAttributeFilter defines attribute names which dt elements can have. // DefinitionTermAttributeFilter defines attribute names which dt elements can have.
var DefinitionTermAttributeFilter = html.GlobalAttributeFilter var DefinitionTermAttributeFilter = html.GlobalAttributeFilter
func (r *DefinitionListHTMLRenderer) renderDefinitionTerm(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) { func (r *DefinitionListHTMLRenderer) renderDefinitionTerm(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
if n.Attributes() != nil { if n.Attributes() != nil {
_, _ = w.WriteString("<dt") _, _ = w.WriteString("<dt")
@ -235,7 +238,8 @@ func (r *DefinitionListHTMLRenderer) renderDefinitionTerm(w util.BufWriter, sour
// DefinitionDescriptionAttributeFilter defines attribute names which dd elements can have. // DefinitionDescriptionAttributeFilter defines attribute names which dd elements can have.
var DefinitionDescriptionAttributeFilter = html.GlobalAttributeFilter var DefinitionDescriptionAttributeFilter = html.GlobalAttributeFilter
func (r *DefinitionListHTMLRenderer) renderDefinitionDescription(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *DefinitionListHTMLRenderer) renderDefinitionDescription(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
n := node.(*ast.DefinitionDescription) n := node.(*ast.DefinitionDescription)
_, _ = w.WriteString("<dd") _, _ = w.WriteString("<dd")

@ -44,8 +44,8 @@ func (b *footnoteBlockParser) Open(parent gast.Node, reader text.Reader, pc pars
return nil, parser.NoChildren return nil, parser.NoChildren
} }
open := pos + 1 open := pos + 1
closes := 0 var closes int
closure := util.FindClosure(line[pos+1:], '[', ']', false, false) closure := util.FindClosure(line[pos+1:], '[', ']', false, false) //nolint:staticcheck
closes = pos + 1 + closure closes = pos + 1 + closure
next := closes + 1 next := closes + 1
if closure > -1 { if closure > -1 {
@ -136,7 +136,7 @@ func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Co
return nil return nil
} }
open := pos open := pos
closure := util.FindClosure(line[pos:], '[', ']', false, false) closure := util.FindClosure(line[pos:], '[', ']', false, false) //nolint:staticcheck
if closure < 0 { if closure < 0 {
return nil return nil
} }
@ -156,7 +156,7 @@ func (s *footnoteParser) Parse(parent gast.Node, block text.Reader, pc parser.Co
d := def.(*ast.Footnote) d := def.(*ast.Footnote)
if bytes.Equal(d.Ref, value) { if bytes.Equal(d.Ref, value) {
if d.Index < 0 { if d.Index < 0 {
list.Count += 1 list.Count++
d.Index = list.Count d.Index = list.Count
} }
index = d.Index index = d.Index
@ -272,9 +272,9 @@ func (a *footnoteASTTransformer) Transform(node *gast.Document, reader text.Read
// FootnoteConfig holds configuration values for the footnote extension. // FootnoteConfig holds configuration values for the footnote extension.
// //
// Link* and Backlink* configurations have some variables: // Link* and Backlink* configurations have some variables:
// Occurrances of “^^” in the string will be replaced by the // Occurrences of “^^” in the string will be replaced by the
// corresponding footnote number in the HTML output. // corresponding footnote number in the HTML output.
// Occurrances of “%%” will be replaced by a number for the // Occurrences of “%%” will be replaced by a number for the
// reference (footnotes can have multiple references). // reference (footnotes can have multiple references).
type FootnoteConfig struct { type FootnoteConfig struct {
html.Config html.Config
@ -382,8 +382,8 @@ func (o *withFootnoteIDPrefix) SetFootnoteOption(c *FootnoteConfig) {
} }
// WithFootnoteIDPrefix is a functional option that is a prefix for the id attributes generated by footnotes. // WithFootnoteIDPrefix is a functional option that is a prefix for the id attributes generated by footnotes.
func WithFootnoteIDPrefix(a []byte) FootnoteOption { func WithFootnoteIDPrefix[T []byte | string](a T) FootnoteOption {
return &withFootnoteIDPrefix{a} return &withFootnoteIDPrefix{[]byte(a)}
} }
const optFootnoteIDPrefixFunction renderer.OptionName = "FootnoteIDPrefixFunction" const optFootnoteIDPrefixFunction renderer.OptionName = "FootnoteIDPrefixFunction"
@ -420,8 +420,8 @@ func (o *withFootnoteLinkTitle) SetFootnoteOption(c *FootnoteConfig) {
} }
// WithFootnoteLinkTitle is a functional option that is an optional title attribute for footnote links. // WithFootnoteLinkTitle is a functional option that is an optional title attribute for footnote links.
func WithFootnoteLinkTitle(a []byte) FootnoteOption { func WithFootnoteLinkTitle[T []byte | string](a T) FootnoteOption {
return &withFootnoteLinkTitle{a} return &withFootnoteLinkTitle{[]byte(a)}
} }
const optFootnoteBacklinkTitle renderer.OptionName = "FootnoteBacklinkTitle" const optFootnoteBacklinkTitle renderer.OptionName = "FootnoteBacklinkTitle"
@ -439,8 +439,8 @@ func (o *withFootnoteBacklinkTitle) SetFootnoteOption(c *FootnoteConfig) {
} }
// WithFootnoteBacklinkTitle is a functional option that is an optional title attribute for footnote backlinks. // WithFootnoteBacklinkTitle is a functional option that is an optional title attribute for footnote backlinks.
func WithFootnoteBacklinkTitle(a []byte) FootnoteOption { func WithFootnoteBacklinkTitle[T []byte | string](a T) FootnoteOption {
return &withFootnoteBacklinkTitle{a} return &withFootnoteBacklinkTitle{[]byte(a)}
} }
const optFootnoteLinkClass renderer.OptionName = "FootnoteLinkClass" const optFootnoteLinkClass renderer.OptionName = "FootnoteLinkClass"
@ -458,8 +458,8 @@ func (o *withFootnoteLinkClass) SetFootnoteOption(c *FootnoteConfig) {
} }
// WithFootnoteLinkClass is a functional option that is a class for footnote links. // WithFootnoteLinkClass is a functional option that is a class for footnote links.
func WithFootnoteLinkClass(a []byte) FootnoteOption { func WithFootnoteLinkClass[T []byte | string](a T) FootnoteOption {
return &withFootnoteLinkClass{a} return &withFootnoteLinkClass{[]byte(a)}
} }
const optFootnoteBacklinkClass renderer.OptionName = "FootnoteBacklinkClass" const optFootnoteBacklinkClass renderer.OptionName = "FootnoteBacklinkClass"
@ -477,8 +477,8 @@ func (o *withFootnoteBacklinkClass) SetFootnoteOption(c *FootnoteConfig) {
} }
// WithFootnoteBacklinkClass is a functional option that is a class for footnote backlinks. // WithFootnoteBacklinkClass is a functional option that is a class for footnote backlinks.
func WithFootnoteBacklinkClass(a []byte) FootnoteOption { func WithFootnoteBacklinkClass[T []byte | string](a T) FootnoteOption {
return &withFootnoteBacklinkClass{a} return &withFootnoteBacklinkClass{[]byte(a)}
} }
const optFootnoteBacklinkHTML renderer.OptionName = "FootnoteBacklinkHTML" const optFootnoteBacklinkHTML renderer.OptionName = "FootnoteBacklinkHTML"
@ -496,8 +496,8 @@ func (o *withFootnoteBacklinkHTML) SetFootnoteOption(c *FootnoteConfig) {
} }
// WithFootnoteBacklinkHTML is an HTML content for footnote backlinks. // WithFootnoteBacklinkHTML is an HTML content for footnote backlinks.
func WithFootnoteBacklinkHTML(a []byte) FootnoteOption { func WithFootnoteBacklinkHTML[T []byte | string](a T) FootnoteOption {
return &withFootnoteBacklinkHTML{a} return &withFootnoteBacklinkHTML{[]byte(a)}
} }
// FootnoteHTMLRenderer is a renderer.NodeRenderer implementation that // FootnoteHTMLRenderer is a renderer.NodeRenderer implementation that
@ -525,7 +525,8 @@ func (r *FootnoteHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncRegist
reg.Register(ast.KindFootnoteList, r.renderFootnoteList) reg.Register(ast.KindFootnoteList, r.renderFootnoteList)
} }
func (r *FootnoteHTMLRenderer) renderFootnoteLink(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *FootnoteHTMLRenderer) renderFootnoteLink(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
n := node.(*ast.FootnoteLink) n := node.(*ast.FootnoteLink)
is := strconv.Itoa(n.Index) is := strconv.Itoa(n.Index)
@ -556,7 +557,8 @@ func (r *FootnoteHTMLRenderer) renderFootnoteLink(w util.BufWriter, source []byt
return gast.WalkContinue, nil return gast.WalkContinue, nil
} }
func (r *FootnoteHTMLRenderer) renderFootnoteBacklink(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *FootnoteHTMLRenderer) renderFootnoteBacklink(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
n := node.(*ast.FootnoteBacklink) n := node.(*ast.FootnoteBacklink)
is := strconv.Itoa(n.Index) is := strconv.Itoa(n.Index)
@ -581,7 +583,8 @@ func (r *FootnoteHTMLRenderer) renderFootnoteBacklink(w util.BufWriter, source [
return gast.WalkContinue, nil return gast.WalkContinue, nil
} }
func (r *FootnoteHTMLRenderer) renderFootnote(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *FootnoteHTMLRenderer) renderFootnote(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
n := node.(*ast.Footnote) n := node.(*ast.Footnote)
is := strconv.Itoa(n.Index) is := strconv.Itoa(n.Index)
if entering { if entering {
@ -600,7 +603,8 @@ func (r *FootnoteHTMLRenderer) renderFootnote(w util.BufWriter, source []byte, n
return gast.WalkContinue, nil return gast.WalkContinue, nil
} }
func (r *FootnoteHTMLRenderer) renderFootnoteList(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *FootnoteHTMLRenderer) renderFootnoteList(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
_, _ = w.WriteString(`<div class="footnotes" role="doc-endnotes"`) _, _ = w.WriteString(`<div class="footnotes" role="doc-endnotes"`)
if node.Attributes() != nil { if node.Attributes() != nil {


@ -38,12 +38,12 @@ func TestFootnoteOptions(t *testing.T) {
), ),
goldmark.WithExtensions( goldmark.WithExtensions(
NewFootnote( NewFootnote(
WithFootnoteIDPrefix([]byte("article12-")), WithFootnoteIDPrefix("article12-"),
WithFootnoteLinkClass([]byte("link-class")), WithFootnoteLinkClass("link-class"),
WithFootnoteBacklinkClass([]byte("backlink-class")), WithFootnoteBacklinkClass("backlink-class"),
WithFootnoteLinkTitle([]byte("link-title-%%-^^")), WithFootnoteLinkTitle("link-title-%%-^^"),
WithFootnoteBacklinkTitle([]byte("backlink-title")), WithFootnoteBacklinkTitle("backlink-title"),
WithFootnoteBacklinkHTML([]byte("^")), WithFootnoteBacklinkHTML("^"),
), ),
), ),
) )


@ -11,9 +11,9 @@ import (
"github.com/yuin/goldmark/util" "github.com/yuin/goldmark/util"
) )
var wwwURLRegxp = regexp.MustCompile(`^www\.[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]+(?:[/#?][-a-zA-Z0-9@:%_\+.~#!?&/=\(\);,'">\^{}\[\]` + "`" + `]*)?`) var wwwURLRegxp = regexp.MustCompile(`^www\.[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]+(?:[/#?][-a-zA-Z0-9@:%_\+.~#!?&/=\(\);,'">\^{}\[\]` + "`" + `]*)?`) //nolint:golint,lll
var urlRegexp = regexp.MustCompile(`^(?:http|https|ftp)://[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]+(?::\d+)?(?:[/#?][-a-zA-Z0-9@:%_+.~#$!?&/=\(\);,'">\^{}\[\]` + "`" + `]*)?`) var urlRegexp = regexp.MustCompile(`^(?:http|https|ftp)://[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-z]+(?::\d+)?(?:[/#?][-a-zA-Z0-9@:%_+.~#$!?&/=\(\);,'">\^{}\[\]` + "`" + `]*)?`) //nolint:golint,lll
// An LinkifyConfig struct is a data structure that holds configuration of the // An LinkifyConfig struct is a data structure that holds configuration of the
// Linkify extension. // Linkify extension.
@ -66,10 +66,12 @@ func (o *withLinkifyAllowedProtocols) SetLinkifyOption(p *LinkifyConfig) {
// WithLinkifyAllowedProtocols is a functional option that specify allowed // WithLinkifyAllowedProtocols is a functional option that specify allowed
// protocols in autolinks. Each protocol must end with ':' like // protocols in autolinks. Each protocol must end with ':' like
// 'http:' . // 'http:' .
func WithLinkifyAllowedProtocols(value [][]byte) LinkifyOption { func WithLinkifyAllowedProtocols[T []byte | string](value []T) LinkifyOption {
return &withLinkifyAllowedProtocols{ opt := &withLinkifyAllowedProtocols{}
value: value, for _, v := range value {
opt.value = append(opt.value, []byte(v))
} }
return opt
} }
type withLinkifyURLRegexp struct { type withLinkifyURLRegexp struct {
@ -92,9 +94,6 @@ func WithLinkifyURLRegexp(value *regexp.Regexp) LinkifyOption {
} }
} }
// WithLinkifyWWWRegexp is a functional option that specify
// a pattern of the URL without a protocol.
// This pattern must start with 'www.' .
type withLinkifyWWWRegexp struct { type withLinkifyWWWRegexp struct {
value *regexp.Regexp value *regexp.Regexp
} }
@ -107,14 +106,15 @@ func (o *withLinkifyWWWRegexp) SetLinkifyOption(p *LinkifyConfig) {
p.WWWRegexp = o.value p.WWWRegexp = o.value
} }
// WithLinkifyWWWRegexp is a functional option that specify
// a pattern of the URL without a protocol.
// This pattern must start with 'www.' .
func WithLinkifyWWWRegexp(value *regexp.Regexp) LinkifyOption { func WithLinkifyWWWRegexp(value *regexp.Regexp) LinkifyOption {
return &withLinkifyWWWRegexp{ return &withLinkifyWWWRegexp{
value: value, value: value,
} }
} }
// WithLinkifyWWWRegexp is a functional otpion that specify
// a pattern of the email address.
type withLinkifyEmailRegexp struct { type withLinkifyEmailRegexp struct {
value *regexp.Regexp value *regexp.Regexp
} }
@ -127,6 +127,8 @@ func (o *withLinkifyEmailRegexp) SetLinkifyOption(p *LinkifyConfig) {
p.EmailRegexp = o.value p.EmailRegexp = o.value
} }
// WithLinkifyEmailRegexp is a functional otpion that specify
// a pattern of the email address.
func WithLinkifyEmailRegexp(value *regexp.Regexp) LinkifyOption { func WithLinkifyEmailRegexp(value *regexp.Regexp) LinkifyOption {
return &withLinkifyEmailRegexp{ return &withLinkifyEmailRegexp{
value: value, value: value,
@ -303,6 +305,8 @@ type linkify struct {
// Linkify is an extension that allow you to parse text that seems like a URL. // Linkify is an extension that allow you to parse text that seems like a URL.
var Linkify = &linkify{} var Linkify = &linkify{}
// NewLinkify creates a new [goldmark.Extender] that
// allow you to parse text that seems like a URL.
func NewLinkify(opts ...LinkifyOption) goldmark.Extender { func NewLinkify(opts ...LinkifyOption) goldmark.Extender {
return &linkify{ return &linkify{
options: opts, options: opts,


@ -29,8 +29,8 @@ func TestLinkifyWithAllowedProtocols(t *testing.T) {
), ),
goldmark.WithExtensions( goldmark.WithExtensions(
NewLinkify( NewLinkify(
WithLinkifyAllowedProtocols([][]byte{ WithLinkifyAllowedProtocols([]string{
[]byte("ssh:"), "ssh:",
}), }),
WithLinkifyURLRegexp( WithLinkifyURLRegexp(
regexp.MustCompile(`\w+://[^\s]+`), regexp.MustCompile(`\w+://[^\s]+`),
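Note on the option signature change above: WithLinkifyAllowedProtocols now takes a type parameter constrained to `[]byte | string`, so callers can pass `[]string{"ssh:"}` directly. A minimal standalone sketch of the same conversion loop, using only the standard library, might look like this. The helper name `toByteSlices` is hypothetical and not part of goldmark.

```go
package main

import "fmt"

// toByteSlices converts a slice of strings or byte slices into a [][]byte.
// Hypothetical helper: it mirrors the loop added to WithLinkifyAllowedProtocols.
func toByteSlices[T []byte | string](values []T) [][]byte {
	out := make([][]byte, 0, len(values))
	for _, v := range values {
		// The conversion compiles because every type in the constraint
		// ([]byte and string) can be converted to []byte.
		out = append(out, []byte(v))
	}
	return out
}

func main() {
	fromStrings := toByteSlices([]string{"ssh:", "http:"}) // T inferred as string
	fromBytes := toByteSlices([][]byte{[]byte("ftp:")})    // T inferred as []byte
	fmt.Printf("%s %s\n", fromStrings, fromBytes)
}
```

Both call styles are accepted, so the old `[][]byte` call sites should keep compiling unchanged.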

extension/package.go

@ -0,0 +1,2 @@
// Package extension is a collection of builtin extensions.
package extension


@ -46,10 +46,11 @@ func (s *strikethroughParser) Trigger() []byte {
func (s *strikethroughParser) Parse(parent gast.Node, block text.Reader, pc parser.Context) gast.Node { func (s *strikethroughParser) Parse(parent gast.Node, block text.Reader, pc parser.Context) gast.Node {
before := block.PrecendingCharacter() before := block.PrecendingCharacter()
line, segment := block.PeekLine() line, segment := block.PeekLine()
node := parser.ScanDelimiter(line, before, 2, defaultStrikethroughDelimiterProcessor) node := parser.ScanDelimiter(line, before, 1, defaultStrikethroughDelimiterProcessor)
if node == nil { if node == nil || node.OriginalLength > 2 || before == '~' {
return nil return nil
} }
node.Segment = segment.WithStop(segment.Start + node.OriginalLength) node.Segment = segment.WithStop(segment.Start + node.OriginalLength)
block.Advance(node.OriginalLength) block.Advance(node.OriginalLength)
pc.PushDelimiter(node) pc.PushDelimiter(node)
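Note on the hunk above: the parser now asks ScanDelimiter for runs of at least one tilde, then rejects runs longer than two and runs that directly follow another tilde. This appears to match the GFM rule that one or two tildes delimit strikethrough while three or more do not. The sketch below restates only that length guard in plain Go. The function names are hypothetical, and the flanking rules that ScanDelimiter also applies are ignored.

```go
package main

import "fmt"

// tildeRunLength reports how many '~' bytes start the line.
func tildeRunLength(line []byte) int {
	n := 0
	for n < len(line) && line[n] == '~' {
		n++
	}
	return n
}

// acceptsStrikethroughRun mimics the new guard: the run must be one or two
// tildes long and must not be preceded by another tilde.
// Hypothetical helper, not part of goldmark.
func acceptsStrikethroughRun(line []byte, before rune) bool {
	n := tildeRunLength(line)
	return n >= 1 && n <= 2 && before != '~'
}

func main() {
	for _, s := range []string{"~a~", "~~a~~", "~~~a~~~"} {
		fmt.Println(s, acceptsStrikethroughRun([]byte(s), ' '))
	}
}
```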
@ -85,7 +86,8 @@ func (r *StrikethroughHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncR
// StrikethroughAttributeFilter defines attribute names which dd elements can have. // StrikethroughAttributeFilter defines attribute names which dd elements can have.
var StrikethroughAttributeFilter = html.GlobalAttributeFilter var StrikethroughAttributeFilter = html.GlobalAttributeFilter
func (r *StrikethroughHTMLRenderer) renderStrikethrough(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) { func (r *StrikethroughHTMLRenderer) renderStrikethrough(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
if n.Attributes() != nil { if n.Attributes() != nil {
_, _ = w.WriteString("<del") _, _ = w.WriteString("<del")


@ -23,7 +23,7 @@ type escapedPipeCell struct {
Transformed bool Transformed bool
} }
// TableCellAlignMethod indicates how are table cells aligned in HTML format.indicates how are table cells aligned in HTML format. // TableCellAlignMethod indicates how are table cells aligned in HTML format.
type TableCellAlignMethod int type TableCellAlignMethod int
const ( const (
@ -181,13 +181,14 @@ func (b *tableParagraphTransformer) Transform(node *gast.Paragraph, reader text.
} }
} }
func (b *tableParagraphTransformer) parseRow(segment text.Segment, alignments []ast.Alignment, isHeader bool, reader text.Reader, pc parser.Context) *ast.TableRow { func (b *tableParagraphTransformer) parseRow(segment text.Segment,
alignments []ast.Alignment, isHeader bool, reader text.Reader, pc parser.Context) *ast.TableRow {
source := reader.Source() source := reader.Source()
segment = segment.TrimLeftSpace(source)
segment = segment.TrimRightSpace(source)
line := segment.Value(source) line := segment.Value(source)
pos := 0 pos := 0
pos += util.TrimLeftSpaceLength(line)
limit := len(line) limit := len(line)
limit -= util.TrimRightSpaceLength(line)
row := ast.NewTableRow(alignments) row := ast.NewTableRow(alignments)
if len(line) > 0 && line[pos] == '|' { if len(line) > 0 && line[pos] == '|' {
pos++ pos++
@ -369,7 +370,8 @@ var TableAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("width"), // [Deprecated] []byte("width"), // [Deprecated]
) )
func (r *TableHTMLRenderer) renderTable(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) { func (r *TableHTMLRenderer) renderTable(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
_, _ = w.WriteString("<table") _, _ = w.WriteString("<table")
if n.Attributes() != nil { if n.Attributes() != nil {
@ -391,7 +393,8 @@ var TableHeaderAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("valign"), // [Deprecated since HTML4] [Obsolete since HTML5] []byte("valign"), // [Deprecated since HTML4] [Obsolete since HTML5]
) )
func (r *TableHTMLRenderer) renderTableHeader(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) { func (r *TableHTMLRenderer) renderTableHeader(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
_, _ = w.WriteString("<thead") _, _ = w.WriteString("<thead")
if n.Attributes() != nil { if n.Attributes() != nil {
@ -418,7 +421,8 @@ var TableRowAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("valign"), // [Obsolete since HTML5] []byte("valign"), // [Obsolete since HTML5]
) )
func (r *TableHTMLRenderer) renderTableRow(w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) { func (r *TableHTMLRenderer) renderTableRow(
w util.BufWriter, source []byte, n gast.Node, entering bool) (gast.WalkStatus, error) {
if entering { if entering {
_, _ = w.WriteString("<tr") _, _ = w.WriteString("<tr")
if n.Attributes() != nil { if n.Attributes() != nil {
@ -445,12 +449,14 @@ var TableThCellAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("charoff"), // [Obsolete since HTML5] []byte("charoff"), // [Obsolete since HTML5]
[]byte("colspan"), // [OK] Number of columns that the cell is to span []byte("colspan"), // [OK] Number of columns that the cell is to span
[]byte("headers"), // [OK] This attribute contains a list of space-separated strings, each corresponding to the id attribute of the <th> elements that apply to this element []byte("headers"), // [OK] This attribute contains a list of space-separated
// strings, each corresponding to the id attribute of the <th> elements that apply to this element
[]byte("height"), // [Deprecated since HTML4] [Obsolete since HTML5] []byte("height"), // [Deprecated since HTML4] [Obsolete since HTML5]
[]byte("rowspan"), // [OK] Number of rows that the cell is to span []byte("rowspan"), // [OK] Number of rows that the cell is to span
[]byte("scope"), // [OK] This enumerated attribute defines the cells that the header (defined in the <th>) element relates to [NOT OK in <td>] []byte("scope"), // [OK] This enumerated attribute defines the cells that
// the header (defined in the <th>) element relates to [NOT OK in <td>]
[]byte("valign"), // [Obsolete since HTML5] []byte("valign"), // [Obsolete since HTML5]
[]byte("width"), // [Deprecated since HTML4] [Obsolete since HTML5] []byte("width"), // [Deprecated since HTML4] [Obsolete since HTML5]
@ -466,7 +472,8 @@ var TableTdCellAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("charoff"), // [Obsolete since HTML5] []byte("charoff"), // [Obsolete since HTML5]
[]byte("colspan"), // [OK] Number of columns that the cell is to span []byte("colspan"), // [OK] Number of columns that the cell is to span
[]byte("headers"), // [OK] This attribute contains a list of space-separated strings, each corresponding to the id attribute of the <th> elements that apply to this element []byte("headers"), // [OK] This attribute contains a list of space-separated
// strings, each corresponding to the id attribute of the <th> elements that apply to this element
[]byte("height"), // [Deprecated since HTML4] [Obsolete since HTML5] []byte("height"), // [Deprecated since HTML4] [Obsolete since HTML5]
@ -477,14 +484,15 @@ var TableTdCellAttributeFilter = html.GlobalAttributeFilter.Extend(
[]byte("width"), // [Deprecated since HTML4] [Obsolete since HTML5] []byte("width"), // [Deprecated since HTML4] [Obsolete since HTML5]
) )
func (r *TableHTMLRenderer) renderTableCell(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *TableHTMLRenderer) renderTableCell(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
n := node.(*ast.TableCell) n := node.(*ast.TableCell)
tag := "td" tag := "td"
if n.Parent().Kind() == ast.KindTableHeader { if n.Parent().Kind() == ast.KindTableHeader {
tag = "th" tag = "th"
} }
if entering { if entering {
fmt.Fprintf(w, "<%s", tag) _, _ = fmt.Fprintf(w, "<%s", tag)
if n.Alignment != ast.AlignNone { if n.Alignment != ast.AlignNone {
amethod := r.TableConfig.TableCellAlignMethod amethod := r.TableConfig.TableCellAlignMethod
if amethod == TableCellAlignDefault { if amethod == TableCellAlignDefault {
@ -497,7 +505,7 @@ func (r *TableHTMLRenderer) renderTableCell(w util.BufWriter, source []byte, nod
switch amethod { switch amethod {
case TableCellAlignAttribute: case TableCellAlignAttribute:
if _, ok := n.AttributeString("align"); !ok { // Skip align render if overridden if _, ok := n.AttributeString("align"); !ok { // Skip align render if overridden
fmt.Fprintf(w, ` align="%s"`, n.Alignment.String()) _, _ = fmt.Fprintf(w, ` align="%s"`, n.Alignment.String())
} }
case TableCellAlignStyle: case TableCellAlignStyle:
v, ok := n.AttributeString("style") v, ok := n.AttributeString("style")
@ -520,7 +528,7 @@ func (r *TableHTMLRenderer) renderTableCell(w util.BufWriter, source []byte, nod
} }
_ = w.WriteByte('>') _ = w.WriteByte('>')
} else { } else {
fmt.Fprintf(w, "</%s>\n", tag) _, _ = fmt.Fprintf(w, "</%s>\n", tag)
} }
return gast.WalkContinue, nil return gast.WalkContinue, nil
} }
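Note on the `_, _ = fmt.Fprintf(...)` changes above: they look like lint fixes that make the discarded return values explicit, without changing behaviour. The sketch below shows the same pattern with a `bufio.Writer` from the standard library. It assumes, as with `bufio.Writer`, that a write error is sticky and is reported again by `Flush`, so skipping the per-call check loses nothing. `renderOpenTag` is a hypothetical helper, not goldmark code.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// renderOpenTag writes "<tag>" through a buffered writer and explicitly
// discards the per-call results, as the linted renderer code does.
func renderOpenTag(tag string) string {
	var sb strings.Builder
	w := bufio.NewWriter(&sb)
	_, _ = fmt.Fprintf(w, "<%s", tag) // byte count and error intentionally ignored
	_ = w.WriteByte('>')
	_ = w.Flush() // a sticky write error would be returned here
	return sb.String()
}

func main() {
	fmt.Println(renderOpenTag("td"))
}
```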


@ -355,3 +355,40 @@ bar | baz
t, t,
) )
} }
func TestTableFuzzedPanics(t *testing.T) {
markdown := goldmark.New(
goldmark.WithRendererOptions(
html.WithXHTML(),
html.WithUnsafe(),
),
goldmark.WithExtensions(
NewTable(),
),
)
testutil.DoTestCase(
markdown,
testutil.MarkdownTestCase{
No: 1,
Description: "This should not panic",
Markdown: "* 0\n-|\n\t0",
Expected: `<ul>
<li>
<table>
<thead>
<tr>
<th>0</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
</tr>
</tbody>
</table>
</li>
</ul>`,
},
t,
)
}


@ -1,6 +1,8 @@
package extension package extension
import ( import (
"regexp"
"github.com/yuin/goldmark" "github.com/yuin/goldmark"
gast "github.com/yuin/goldmark/ast" gast "github.com/yuin/goldmark/ast"
"github.com/yuin/goldmark/extension/ast" "github.com/yuin/goldmark/extension/ast"
@ -9,7 +11,6 @@ import (
"github.com/yuin/goldmark/renderer/html" "github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/text" "github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util" "github.com/yuin/goldmark/util"
"regexp"
) )
var taskListRegexp = regexp.MustCompile(`^\[([\sxX])\]\s*`) var taskListRegexp = regexp.MustCompile(`^\[([\sxX])\]\s*`)
@ -40,6 +41,9 @@ func (s *taskCheckBoxParser) Parse(parent gast.Node, block text.Reader, pc parse
return nil return nil
} }
if parent.HasChildren() {
return nil
}
if _, ok := parent.Parent().(*gast.ListItem); !ok { if _, ok := parent.Parent().(*gast.ListItem); !ok {
return nil return nil
} }
@ -80,21 +84,22 @@ func (r *TaskCheckBoxHTMLRenderer) RegisterFuncs(reg renderer.NodeRendererFuncRe
reg.Register(ast.KindTaskCheckBox, r.renderTaskCheckBox) reg.Register(ast.KindTaskCheckBox, r.renderTaskCheckBox)
} }
func (r *TaskCheckBoxHTMLRenderer) renderTaskCheckBox(w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) { func (r *TaskCheckBoxHTMLRenderer) renderTaskCheckBox(
w util.BufWriter, source []byte, node gast.Node, entering bool) (gast.WalkStatus, error) {
if !entering { if !entering {
return gast.WalkContinue, nil return gast.WalkContinue, nil
} }
n := node.(*ast.TaskCheckBox) n := node.(*ast.TaskCheckBox)
if n.IsChecked { if n.IsChecked {
w.WriteString(`<input checked="" disabled="" type="checkbox"`) _, _ = w.WriteString(`<input checked="" disabled="" type="checkbox"`)
} else { } else {
w.WriteString(`<input disabled="" type="checkbox"`) _, _ = w.WriteString(`<input disabled="" type="checkbox"`)
} }
if r.XHTML { if r.XHTML {
w.WriteString(" /> ") _, _ = w.WriteString(" /> ")
} else { } else {
w.WriteString("> ") _, _ = w.WriteString("> ")
} }
return gast.WalkContinue, nil return gast.WalkContinue, nil
} }


@ -36,25 +36,25 @@ func getUnclosedCounter(pc parser.Context) *unclosedCounter {
type TypographicPunctuation int type TypographicPunctuation int
const ( const (
// LeftSingleQuote is ' // LeftSingleQuote is ' .
LeftSingleQuote TypographicPunctuation = iota + 1 LeftSingleQuote TypographicPunctuation = iota + 1
// RightSingleQuote is ' // RightSingleQuote is ' .
RightSingleQuote RightSingleQuote
// LeftDoubleQuote is " // LeftDoubleQuote is " .
LeftDoubleQuote LeftDoubleQuote
// RightDoubleQuote is " // RightDoubleQuote is " .
RightDoubleQuote RightDoubleQuote
// EnDash is -- // EnDash is -- .
EnDash EnDash
// EmDash is --- // EmDash is --- .
EmDash EmDash
// Ellipsis is ... // Ellipsis is ... .
Ellipsis Ellipsis
// LeftAngleQuote is << // LeftAngleQuote is << .
LeftAngleQuote LeftAngleQuote
// RightAngleQuote is >> // RightAngleQuote is >> .
RightAngleQuote RightAngleQuote
// Apostrophe is ' // Apostrophe is ' .
Apostrophe Apostrophe
typographicPunctuationMax typographicPunctuationMax
@ -115,10 +115,10 @@ func (o *withTypographicSubstitutions) SetTypographerOption(p *TypographerConfig
// WithTypographicSubstitutions is a functional otpion that specify replacement text // WithTypographicSubstitutions is a functional otpion that specify replacement text
// for punctuations. // for punctuations.
func WithTypographicSubstitutions(values map[TypographicPunctuation][]byte) TypographerOption { func WithTypographicSubstitutions[T []byte | string](values map[TypographicPunctuation]T) TypographerOption {
replacements := newDefaultSubstitutions() replacements := newDefaultSubstitutions()
for k, v := range values { for k, v := range values {
replacements[k] = v replacements[k] = []byte(v)
} }
return &withTypographicSubstitutions{replacements} return &withTypographicSubstitutions{replacements}
@ -218,7 +218,8 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
if c == '\'' { if c == '\'' {
if s.Substitutions[Apostrophe] != nil { if s.Substitutions[Apostrophe] != nil {
// Handle decade abbrevations such as '90s // Handle decade abbrevations such as '90s
if d.CanOpen && !d.CanClose && len(line) > 3 && util.IsNumeric(line[1]) && util.IsNumeric(line[2]) && line[3] == 's' { if d.CanOpen && !d.CanClose && len(line) > 3 &&
util.IsNumeric(line[1]) && util.IsNumeric(line[2]) && line[3] == 's' {
after := rune(' ') after := rune(' ')
if len(line) > 4 { if len(line) > 4 {
after = util.ToRune(line, 4) after = util.ToRune(line, 4)
@ -231,7 +232,8 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
} }
} }
// special cases: 'twas, 'em, 'net // special cases: 'twas, 'em, 'net
if len(line) > 1 && (unicode.IsPunct(before) || unicode.IsSpace(before)) && (line[1] == 't' || line[1] == 'e' || line[1] == 'n' || line[1] == 'l') { if len(line) > 1 && (unicode.IsPunct(before) || unicode.IsSpace(before)) &&
(line[1] == 't' || line[1] == 'e' || line[1] == 'n' || line[1] == 'l') {
node := gast.NewString(s.Substitutions[Apostrophe]) node := gast.NewString(s.Substitutions[Apostrophe])
node.SetCode(true) node.SetCode(true)
block.Advance(1) block.Advance(1)
@ -239,7 +241,8 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
} }
// Convert normal apostrophes. This is probably more flexible than necessary but // Convert normal apostrophes. This is probably more flexible than necessary but
// converts any apostrophe in between two alphanumerics. // converts any apostrophe in between two alphanumerics.
if len(line) > 1 && (unicode.IsDigit(before) || unicode.IsLetter(before)) && (unicode.IsLetter(util.ToRune(line, 1))) { if len(line) > 1 && (unicode.IsDigit(before) || unicode.IsLetter(before)) &&
(unicode.IsLetter(util.ToRune(line, 1))) {
node := gast.NewString(s.Substitutions[Apostrophe]) node := gast.NewString(s.Substitutions[Apostrophe])
node.SetCode(true) node.SetCode(true)
block.Advance(1) block.Advance(1)
@ -249,11 +252,14 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
if s.Substitutions[LeftSingleQuote] != nil && d.CanOpen && !d.CanClose { if s.Substitutions[LeftSingleQuote] != nil && d.CanOpen && !d.CanClose {
nt := LeftSingleQuote nt := LeftSingleQuote
// special cases: Alice's, I'm, Don't, You'd // special cases: Alice's, I'm, Don't, You'd
if len(line) > 1 && (line[1] == 's' || line[1] == 'm' || line[1] == 't' || line[1] == 'd') && (len(line) < 3 || util.IsPunct(line[2]) || util.IsSpace(line[2])) { if len(line) > 1 && (line[1] == 's' || line[1] == 'm' || line[1] == 't' || line[1] == 'd') &&
(len(line) < 3 || util.IsPunct(line[2]) || util.IsSpace(line[2])) {
nt = RightSingleQuote nt = RightSingleQuote
} }
// special cases: I've, I'll, You're // special cases: I've, I'll, You're
if len(line) > 2 && ((line[1] == 'v' && line[2] == 'e') || (line[1] == 'l' && line[2] == 'l') || (line[1] == 'r' && line[2] == 'e')) && (len(line) < 4 || util.IsPunct(line[3]) || util.IsSpace(line[3])) { if len(line) > 2 && ((line[1] == 'v' && line[2] == 'e') ||
(line[1] == 'l' && line[2] == 'l') || (line[1] == 'r' && line[2] == 'e')) &&
(len(line) < 4 || util.IsPunct(line[3]) || util.IsSpace(line[3])) {
nt = RightSingleQuote nt = RightSingleQuote
} }
if nt == LeftSingleQuote { if nt == LeftSingleQuote {
@ -266,8 +272,9 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
return node return node
} }
if s.Substitutions[RightSingleQuote] != nil { if s.Substitutions[RightSingleQuote] != nil {
// plural possesives and abbreviations: Smiths', doin' // plural possesive and abbreviations: Smiths', doin'
if len(line) > 1 && unicode.IsSpace(util.ToRune(line, 0)) || unicode.IsPunct(util.ToRune(line, 0)) && (len(line) > 2 && !unicode.IsDigit(util.ToRune(line, 1))) { if len(line) > 1 && unicode.IsSpace(util.ToRune(line, 0)) || unicode.IsPunct(util.ToRune(line, 0)) &&
(len(line) > 2 && !unicode.IsDigit(util.ToRune(line, 1))) {
node := gast.NewString(s.Substitutions[RightSingleQuote]) node := gast.NewString(s.Substitutions[RightSingleQuote])
node.SetCode(true) node.SetCode(true)
block.Advance(1) block.Advance(1)
@ -276,7 +283,8 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
} }
if s.Substitutions[RightSingleQuote] != nil && counter.Single > 0 { if s.Substitutions[RightSingleQuote] != nil && counter.Single > 0 {
isClose := d.CanClose && !d.CanOpen isClose := d.CanClose && !d.CanOpen
maybeClose := d.CanClose && d.CanOpen && len(line) > 1 && unicode.IsPunct(util.ToRune(line, 1)) && (len(line) == 2 || (len(line) > 2 && util.IsPunct(line[2]) || util.IsSpace(line[2]))) maybeClose := d.CanClose && d.CanOpen && len(line) > 1 && unicode.IsPunct(util.ToRune(line, 1)) &&
(len(line) == 2 || (len(line) > 2 && util.IsPunct(line[2]) || util.IsSpace(line[2])))
if isClose || maybeClose { if isClose || maybeClose {
node := gast.NewString(s.Substitutions[RightSingleQuote]) node := gast.NewString(s.Substitutions[RightSingleQuote])
node.SetCode(true) node.SetCode(true)
@ -296,7 +304,8 @@ func (s *typographerParser) Parse(parent gast.Node, block text.Reader, pc parser
} }
if s.Substitutions[RightDoubleQuote] != nil && counter.Double > 0 { if s.Substitutions[RightDoubleQuote] != nil && counter.Double > 0 {
isClose := d.CanClose && !d.CanOpen isClose := d.CanClose && !d.CanOpen
maybeClose := d.CanClose && d.CanOpen && len(line) > 1 && (unicode.IsPunct(util.ToRune(line, 1))) && (len(line) == 2 || (len(line) > 2 && util.IsPunct(line[2]) || util.IsSpace(line[2]))) maybeClose := d.CanClose && d.CanOpen && len(line) > 1 && (unicode.IsPunct(util.ToRune(line, 1))) &&
(len(line) == 2 || (len(line) > 2 && util.IsPunct(line[2]) || util.IsSpace(line[2])))
if isClose || maybeClose { if isClose || maybeClose {
// special case: "Monitor 21"" // special case: "Monitor 21""
if len(line) > 1 && line[1] == '"' && unicode.IsDigit(before) { if len(line) > 1 && line[1] == '"' && unicode.IsDigit(before) {


@ -3,7 +3,7 @@ package fuzz
import ( import (
"bytes" "bytes"
"encoding/json" "encoding/json"
"io/ioutil" "os"
"testing" "testing"
"github.com/yuin/goldmark" "github.com/yuin/goldmark"
@ -42,7 +42,7 @@ func fuzz(f *testing.F) {
} }
func FuzzDefault(f *testing.F) { func FuzzDefault(f *testing.F) {
bs, err := ioutil.ReadFile("../_test/spec.json") bs, err := os.ReadFile("../_test/spec.json")
if err != nil { if err != nil {
panic(err) panic(err)
} }
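Note on the import change above: `io/ioutil` has been deprecated since Go 1.16, and `ioutil.ReadFile` simply calls `os.ReadFile`, so the swap should be behaviour-preserving. A small self-contained round trip using the replacement is sketched below. The temporary directory and the `roundTrip` helper are illustrative only and do not touch the repository's `../_test/spec.json`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// roundTrip writes content to a temporary file and reads it back with
// os.ReadFile, the drop-in replacement for ioutil.ReadFile.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "readfile-example")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir) // clean up the temporary directory

	path := filepath.Join(dir, "spec.json")
	if err := os.WriteFile(path, []byte(content), 0o600); err != nil {
		return "", err
	}
	bs, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	return string(bs), nil
}

func main() {
	got, err := roundTrip(`[]`)
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```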

go.mod

@ -1,3 +1,3 @@
module github.com/yuin/goldmark module github.com/yuin/goldmark
go 1.18 go 1.19


@ -2,12 +2,13 @@
package goldmark package goldmark
import ( import (
"io"
"github.com/yuin/goldmark/parser" "github.com/yuin/goldmark/parser"
"github.com/yuin/goldmark/renderer" "github.com/yuin/goldmark/renderer"
"github.com/yuin/goldmark/renderer/html" "github.com/yuin/goldmark/renderer/html"
"github.com/yuin/goldmark/text" "github.com/yuin/goldmark/text"
"github.com/yuin/goldmark/util" "github.com/yuin/goldmark/util"
"io"
) )
// DefaultParser returns a new Parser that is configured by default values. // DefaultParser returns a new Parser that is configured by default values.


@ -12,7 +12,7 @@ import (
var attrNameID = []byte("id") var attrNameID = []byte("id")
var attrNameClass = []byte("class") var attrNameClass = []byte("class")
// An Attribute is an attribute of the markdown elements // An Attribute is an attribute of the markdown elements.
type Attribute struct { type Attribute struct {
Name []byte Name []byte
Value interface{} Value interface{}
@ -93,7 +93,8 @@ func parseAttribute(reader text.Reader) (Attribute, bool) {
// CommonMark is basically defined for XHTML(even though it is legacy). // CommonMark is basically defined for XHTML(even though it is legacy).
// So we restrict id characters. // So we restrict id characters.
for ; i < len(line) && !util.IsSpace(line[i]) && for ; i < len(line) && !util.IsSpace(line[i]) &&
(!util.IsPunct(line[i]) || line[i] == '_' || line[i] == '-' || line[i] == ':' || line[i] == '.'); i++ { (!util.IsPunct(line[i]) || line[i] == '_' ||
line[i] == '-' || line[i] == ':' || line[i] == '.'); i++ {
} }
name := attrNameClass name := attrNameClass
if c == '#' { if c == '#' {
@ -145,7 +146,7 @@ func parseAttributeValue(reader text.Reader) (interface{}, bool) {
reader.SkipSpaces() reader.SkipSpaces()
c := reader.Peek() c := reader.Peek()
var value interface{} var value interface{}
ok := false var ok bool
switch c { switch c {
case text.EOF: case text.EOF:
return Attribute{}, false return Attribute{}, false
@ -244,7 +245,7 @@ func scanAttributeDecimal(reader text.Reader, w io.ByteWriter) {
for { for {
c := reader.Peek() c := reader.Peek()
if util.IsNumeric(c) { if util.IsNumeric(c) {
w.WriteByte(c) _ = w.WriteByte(c)
} else { } else {
return return
} }
@ -286,7 +287,7 @@ func parseAttributeNumber(reader text.Reader) (float64, bool) {
} }
scanAttributeDecimal(reader, &buf) scanAttributeDecimal(reader, &buf)
} }
f, err := strconv.ParseFloat(buf.String(), 10) f, err := strconv.ParseFloat(buf.String(), 64)
if err != nil { if err != nil {
return 0, false return 0, false
} }
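Note on the ParseFloat change above: the second argument of `strconv.ParseFloat` is a bit size (32 or 64 per the documentation), not a numeric base, so the old value 10 was probably a mix-up with `ParseInt`. A minimal sketch of the corrected call follows. `parseNumber` is a hypothetical wrapper that mirrors the `(float64, bool)` return shape of parseAttributeNumber.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseNumber parses s as a float64 and reports whether parsing succeeded.
// Hypothetical wrapper, not part of goldmark.
func parseNumber(s string) (float64, bool) {
	f, err := strconv.ParseFloat(s, 64) // 64 selects float64 precision
	if err != nil {
		return 0, false
	}
	return f, true
}

func main() {
	fmt.Println(parseNumber("1.5e3"))
	fmt.Println(parseNumber("abc"))
}
```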


@ -13,7 +13,7 @@ type HeadingConfig struct {
} }
// SetOption implements SetOptioner. // SetOption implements SetOptioner.
func (b *HeadingConfig) SetOption(name OptionName, value interface{}) { func (b *HeadingConfig) SetOption(name OptionName, _ interface{}) {
switch name { switch name {
case optAutoHeadingID: case optAutoHeadingID:
b.AutoHeadingID = true b.AutoHeadingID = true
@ -135,7 +135,9 @@ func (b *atxHeadingParser) Open(parent ast.Node, reader text.Reader, pc Context)
for _, attr := range attrs { for _, attr := range attrs {
node.SetAttribute(attr.Name, attr.Value) node.SetAttribute(attr.Name, attr.Value)
} }
node.Lines().Append(text.NewSegment(segment.Start+start+1-segment.Padding, segment.Start+closureOpen-segment.Padding)) node.Lines().Append(text.NewSegment(
segment.Start+start+1-segment.Padding,
segment.Start+closureOpen-segment.Padding))
} }
} }
} }


@ -28,12 +28,13 @@ func (b *blockquoteParser) process(reader text.Reader) bool {
reader.Advance(pos) reader.Advance(pos)
return true return true
} }
if line[pos] == ' ' || line[pos] == '\t' {
pos++
}
reader.Advance(pos) reader.Advance(pos)
if line[pos-1] == '\t' { if line[pos] == ' ' || line[pos] == '\t' {
reader.SetPadding(2) padding := 0
if line[pos] == '\t' {
padding = util.TabWidth(reader.LineOffset()) - 1
}
reader.AdvanceAndSetPadding(1, padding)
} }
return true return true
} }
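Note on the blockquote hunk above: the fixed `SetPadding(2)` is replaced by a padding computed from the tab's column, `util.TabWidth(reader.LineOffset()) - 1`. The sketch below reproduces that arithmetic under the assumption that tab stops fall every 4 columns and that one column of the tab is consumed as the optional space after `>`. `tabWidth` and `paddingAfterMarker` are local, hypothetical stand-ins, not the goldmark functions.

```go
package main

import "fmt"

// tabWidth returns how many columns a tab occupies when it starts at
// column col, assuming tab stops every 4 columns.
func tabWidth(col int) int {
	return 4 - col%4
}

// paddingAfterMarker returns the columns left over from a tab that follows
// the '>' marker, after one column is used as the optional space.
func paddingAfterMarker(col int) int {
	return tabWidth(col) - 1
}

func main() {
	// ">\t": the tab starts at column 1, which gives the old constant 2.
	fmt.Println(paddingAfterMarker(1))
	// " >\t": the tab starts at column 2, where the old constant was too large.
	fmt.Println(paddingAfterMarker(2))
}
```

Under these assumptions the old constant was only right when `>` sat in column 0, which would explain why the computed value replaces it.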


@ -35,6 +35,7 @@ func (b *codeBlockParser) Open(parent ast.Node, reader text.Reader, pc Context)
if segment.Padding != 0 { if segment.Padding != 0 {
preserveLeadingTabInCodeBlock(&segment, reader, 0) preserveLeadingTabInCodeBlock(&segment, reader, 0)
} }
segment.ForceNewline = true
node.Lines().Append(segment) node.Lines().Append(segment)
reader.Advance(segment.Len() - 1) reader.Advance(segment.Len() - 1)
return node, NoChildren return node, NoChildren
@ -59,6 +60,7 @@ func (b *codeBlockParser) Continue(node ast.Node, reader text.Reader, pc Context
preserveLeadingTabInCodeBlock(&segment, reader, 0) preserveLeadingTabInCodeBlock(&segment, reader, 0)
} }
segment.ForceNewline = true
node.Lines().Append(segment) node.Lines().Append(segment)
reader.Advance(segment.Len() - 1) reader.Advance(segment.Len() - 1)
return Continue | NoChildren return Continue | NoChildren


@ -66,12 +66,12 @@ func (d *Delimiter) Dump(source []byte, level int) {
var kindDelimiter = ast.NewNodeKind("Delimiter") var kindDelimiter = ast.NewNodeKind("Delimiter")
// Kind implements Node.Kind // Kind implements Node.Kind.
func (d *Delimiter) Kind() ast.NodeKind { func (d *Delimiter) Kind() ast.NodeKind {
return kindDelimiter return kindDelimiter
} }
// Text implements Node.Text // Text implements Node.Text.
func (d *Delimiter) Text(source []byte) []byte { func (d *Delimiter) Text(source []byte) []byte {
return d.Segment.Value(source) return d.Segment.Value(source)
} }
@ -126,7 +126,7 @@ func ScanDelimiter(line []byte, before rune, min int, processor DelimiterProcess
after = util.ToRune(line, j) after = util.ToRune(line, j)
} }
canOpen, canClose := false, false var canOpen, canClose bool
beforeIsPunctuation := util.IsPunctRune(before) beforeIsPunctuation := util.IsPunctRune(before)
beforeIsWhitespace := util.IsSpaceRune(before) beforeIsWhitespace := util.IsSpaceRune(before)
afterIsPunctuation := util.IsPunctRune(after) afterIsPunctuation := util.IsPunctRune(after)


@ -100,6 +100,7 @@ func (b *fencedCodeBlockParser) Continue(node ast.Node, reader text.Reader, pc C
if padding != 0 { if padding != 0 {
preserveLeadingTabInCodeBlock(&seg, reader, fdata.indent) preserveLeadingTabInCodeBlock(&seg, reader, fdata.indent)
} }
seg.ForceNewline = true // EOF as newline
node.Lines().Append(seg) node.Lines().Append(seg)
reader.AdvanceAndSetPadding(segment.Stop-segment.Start-pos-1, padding) reader.AdvanceAndSetPadding(segment.Stop-segment.Start-pos-1, padding)
return Continue | NoChildren return Continue | NoChildren


@ -61,8 +61,8 @@ var allowedBlockTags = map[string]bool{
"option": true, "option": true,
"p": true, "p": true,
"param": true, "param": true,
"search": true,
"section": true, "section": true,
"source": true,
"summary": true, "summary": true,
"table": true, "table": true,
"tbody": true, "tbody": true,
@ -76,7 +76,7 @@ var allowedBlockTags = map[string]bool{
"ul": true, "ul": true,
} }
var htmlBlockType1OpenRegexp = regexp.MustCompile(`(?i)^[ ]{0,3}<(script|pre|style|textarea)(?:\s.*|>.*|/>.*|)(?:\r\n|\n)?$`) var htmlBlockType1OpenRegexp = regexp.MustCompile(`(?i)^[ ]{0,3}<(script|pre|style|textarea)(?:\s.*|>.*|/>.*|)(?:\r\n|\n)?$`) //nolint:golint,lll
var htmlBlockType1CloseRegexp = regexp.MustCompile(`(?i)^.*</(?:script|pre|style|textarea)>.*`) var htmlBlockType1CloseRegexp = regexp.MustCompile(`(?i)^.*</(?:script|pre|style|textarea)>.*`)
var htmlBlockType2OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<!\-\-`) var htmlBlockType2OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<!\-\-`)
@ -91,9 +91,9 @@ var htmlBlockType4Close = []byte{'>'}
var htmlBlockType5OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<\!\[CDATA\[`) var htmlBlockType5OpenRegexp = regexp.MustCompile(`^[ ]{0,3}<\!\[CDATA\[`)
var htmlBlockType5Close = []byte{']', ']', '>'} var htmlBlockType5Close = []byte{']', ']', '>'}
var htmlBlockType6Regexp = regexp.MustCompile(`^[ ]{0,3}<(?:/[ ]*)?([a-zA-Z]+[a-zA-Z0-9\-]*)(?:[ ].*|>.*|/>.*|)(?:\r\n|\n)?$`) var htmlBlockType6Regexp = regexp.MustCompile(`^[ ]{0,3}<(?:/[ ]*)?([a-zA-Z]+[a-zA-Z0-9\-]*)(?:[ ].*|>.*|/>.*|)(?:\r\n|\n)?$`) //nolint:golint,lll
var htmlBlockType7Regexp = regexp.MustCompile(`^[ ]{0,3}<(/[ ]*)?([a-zA-Z]+[a-zA-Z0-9\-]*)(` + attributePattern + `*)[ ]*(?:>|/>)[ ]*(?:\r\n|\n)?$`) var htmlBlockType7Regexp = regexp.MustCompile(`^[ ]{0,3}<(/[ ]*)?([a-zA-Z]+[a-zA-Z0-9\-]*)(` + attributePattern + `*)[ ]*(?:>|/>)[ ]*(?:\r\n|\n)?$`) //nolint:golint,lll
type htmlBlockParser struct { type htmlBlockParser struct {
} }
@ -135,7 +135,8 @@ func (b *htmlBlockParser) Open(parent ast.Node, reader text.Reader, pc Context)
_, ok := allowedBlockTags[tagName] _, ok := allowedBlockTags[tagName]
if ok { if ok {
node = ast.NewHTMLBlock(ast.HTMLBlockType6) node = ast.NewHTMLBlock(ast.HTMLBlockType6)
} else if tagName != "script" && tagName != "style" && tagName != "pre" && !ast.IsParagraph(last) && !(isCloseTag && hasAttr) { // type 7 can not interrupt paragraph } else if tagName != "script" && tagName != "style" &&
tagName != "pre" && !ast.IsParagraph(last) && !(isCloseTag && hasAttr) { // type 7 can not interrupt paragraph
node = ast.NewHTMLBlock(ast.HTMLBlockType7) node = ast.NewHTMLBlock(ast.HTMLBlockType7)
} }
} }


@ -126,13 +126,13 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
if line[0] == '!' { if line[0] == '!' {
if len(line) > 1 && line[1] == '[' { if len(line) > 1 && line[1] == '[' {
block.Advance(1) block.Advance(1)
pc.Set(linkBottom, pc.LastDelimiter()) pushLinkBottom(pc)
return processLinkLabelOpen(block, segment.Start+1, true, pc) return processLinkLabelOpen(block, segment.Start+1, true, pc)
} }
return nil return nil
} }
if line[0] == '[' { if line[0] == '[' {
pc.Set(linkBottom, pc.LastDelimiter()) pushLinkBottom(pc)
return processLinkLabelOpen(block, segment.Start, false, pc) return processLinkLabelOpen(block, segment.Start, false, pc)
} }
@ -143,6 +143,7 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
} }
last := tlist.(*linkLabelState).Last last := tlist.(*linkLabelState).Last
if last == nil { if last == nil {
_ = popLinkBottom(pc)
return nil return nil
} }
block.Advance(1) block.Advance(1)
@ -151,11 +152,13 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
// > A link label can have at most 999 characters inside the square brackets. // > A link label can have at most 999 characters inside the square brackets.
if linkLabelStateLength(tlist.(*linkLabelState)) > 998 { if linkLabelStateLength(tlist.(*linkLabelState)) > 998 {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment) ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil return nil
} }
if !last.IsImage && s.containsLink(last) { // a link in a link text is not allowed if !last.IsImage && s.containsLink(last) { // a link in a link text is not allowed
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment) ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil return nil
} }
@ -169,6 +172,7 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
link, hasValue = s.parseReferenceLink(parent, last, block, pc) link, hasValue = s.parseReferenceLink(parent, last, block, pc)
if link == nil && hasValue { if link == nil && hasValue {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment) ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil return nil
} }
} }
@ -182,12 +186,14 @@ func (s *linkParser) Parse(parent ast.Node, block text.Reader, pc Context) ast.N
// > A link label can have at most 999 characters inside the square brackets. // > A link label can have at most 999 characters inside the square brackets.
if len(maybeReference) > 999 { if len(maybeReference) > 999 {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment) ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil return nil
} }
ref, ok := pc.Reference(util.ToLinkReference(maybeReference)) ref, ok := pc.Reference(util.ToLinkReference(maybeReference))
if !ok { if !ok {
ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment) ast.MergeOrReplaceTextSegment(last.Parent(), last, last.Segment)
_ = popLinkBottom(pc)
return nil return nil
} }
link = ast.NewLink() link = ast.NewLink()
@ -230,11 +236,7 @@ func processLinkLabelOpen(block text.Reader, pos int, isImage bool, pc Context)
} }
func (s *linkParser) processLinkLabel(parent ast.Node, link *ast.Link, last *linkLabelState, pc Context) { func (s *linkParser) processLinkLabel(parent ast.Node, link *ast.Link, last *linkLabelState, pc Context) {
var bottom ast.Node bottom := popLinkBottom(pc)
if v := pc.Get(linkBottom); v != nil {
bottom = v.(ast.Node)
}
pc.Set(linkBottom, nil)
ProcessDelimiters(bottom, pc) ProcessDelimiters(bottom, pc)
for c := last.NextSibling(); c != nil; { for c := last.NextSibling(); c != nil; {
next := c.NextSibling() next := c.NextSibling()
@ -250,7 +252,8 @@ var linkFindClosureOptions text.FindClosureOptions = text.FindClosureOptions{
Advance: true, Advance: true,
} }
func (s *linkParser) parseReferenceLink(parent ast.Node, last *linkLabelState, block text.Reader, pc Context) (*ast.Link, bool) { func (s *linkParser) parseReferenceLink(parent ast.Node, last *linkLabelState,
block text.Reader, pc Context) (*ast.Link, bool) {
_, orgpos := block.Position() _, orgpos := block.Position()
block.Advance(1) // skip '[' block.Advance(1) // skip '['
segments, found := block.FindClosure('[', ']', linkFindClosureOptions) segments, found := block.FindClosure('[', ']', linkFindClosureOptions)
@ -394,6 +397,43 @@ func parseLinkTitle(block text.Reader) ([]byte, bool) {
return nil, false return nil, false
} }
func pushLinkBottom(pc Context) {
bottoms := pc.Get(linkBottom)
b := pc.LastDelimiter()
if bottoms == nil {
pc.Set(linkBottom, b)
return
}
if s, ok := bottoms.([]ast.Node); ok {
pc.Set(linkBottom, append(s, b))
return
}
pc.Set(linkBottom, []ast.Node{bottoms.(ast.Node), b})
}
func popLinkBottom(pc Context) ast.Node {
bottoms := pc.Get(linkBottom)
if bottoms == nil {
return nil
}
if v, ok := bottoms.(ast.Node); ok {
pc.Set(linkBottom, nil)
return v
}
s := bottoms.([]ast.Node)
v := s[len(s)-1]
n := s[0 : len(s)-1]
switch len(n) {
case 0:
pc.Set(linkBottom, nil)
case 1:
pc.Set(linkBottom, n[0])
default:
pc.Set(linkBottom, s[0:len(s)-1])
}
return v
}
func (s *linkParser) CloseBlock(parent ast.Node, block text.Reader, pc Context) { func (s *linkParser) CloseBlock(parent ast.Node, block text.Reader, pc Context) {
pc.Set(linkBottom, nil) pc.Set(linkBottom, nil)
tlist := pc.Get(linkLabelStateKey) tlist := pc.Get(linkLabelStateKey)


@ -22,7 +22,7 @@ var listItemFlagValue interface{} = true
// Same as // Same as
// `^(([ ]*)([\-\*\+]))(\s+.*)?\n?$`.FindSubmatchIndex or // `^(([ ]*)([\-\*\+]))(\s+.*)?\n?$`.FindSubmatchIndex or
// `^(([ ]*)(\d{1,9}[\.\)]))(\s+.*)?\n?$`.FindSubmatchIndex // `^(([ ]*)(\d{1,9}[\.\)]))(\s+.*)?\n?$`.FindSubmatchIndex.
func parseListItem(line []byte) ([6]int, listItemType) { func parseListItem(line []byte) ([6]int, listItemType) {
i := 0 i := 0
l := len(line) l := len(line)
@ -89,7 +89,7 @@ func matchesListItem(source []byte, strict bool) ([6]int, listItemType) {
} }
func calcListOffset(source []byte, match [6]int) int { func calcListOffset(source []byte, match [6]int) int {
offset := 0 var offset int
if match[4] < 0 || util.IsBlank(source[match[4]:]) { // list item starts with a blank line if match[4] < 0 || util.IsBlank(source[match[4]:]) { // list item starts with a blank line
offset = 1 offset = 1
} else { } else {
@ -250,14 +250,14 @@ func (b *listParser) Close(node ast.Node, reader text.Reader, pc Context) {
for c := node.FirstChild(); c != nil && list.IsTight; c = c.NextSibling() { for c := node.FirstChild(); c != nil && list.IsTight; c = c.NextSibling() {
if c.FirstChild() != nil && c.FirstChild() != c.LastChild() { if c.FirstChild() != nil && c.FirstChild() != c.LastChild() {
for c1 := c.FirstChild().NextSibling(); c1 != nil; c1 = c1.NextSibling() { for c1 := c.FirstChild().NextSibling(); c1 != nil; c1 = c1.NextSibling() {
if bl, ok := c1.(ast.Node); ok && bl.HasBlankPreviousLines() { if c1.HasBlankPreviousLines() {
list.IsTight = false list.IsTight = false
break break
} }
} }
} }
if c != node.FirstChild() { if c != node.FirstChild() {
if bl, ok := c.(ast.Node); ok && bl.HasBlankPreviousLines() { if c.HasBlankPreviousLines() {
list.IsTight = false list.IsTight = false
} }
} }


@ -58,7 +58,7 @@ func (b *listItemParser) Continue(node ast.Node, reader text.Reader, pc Context)
} }
offset := lastOffset(node.Parent()) offset := lastOffset(node.Parent())
isEmpty := node.ChildCount() == 0 isEmpty := node.ChildCount() == 0 && pc.Get(emptyListItemWithBlankLines) != nil
indent, _ := util.IndentWidth(line, reader.LineOffset()) indent, _ := util.IndentWidth(line, reader.LineOffset())
if (isEmpty || indent < offset) && indent < 4 { if (isEmpty || indent < offset) && indent < 4 {
_, typ := matchesListItem(line, true) _, typ := matchesListItem(line, true)


@ -403,7 +403,8 @@ func (p *parseContext) IsInLinkLabel() bool {
type State int type State int
const ( const (
none State = 1 << iota // None is a default value of the [State].
None State = 1 << iota
// Continue indicates parser can continue parsing. // Continue indicates parser can continue parsing.
Continue Continue
@ -881,6 +882,7 @@ func (p *parser) Parse(reader text.Reader, opts ...ParseOption) ast.Node {
for _, at := range p.astTransformers { for _, at := range p.astTransformers {
at.Transform(root, reader, pc) at.Transform(root, reader, pc)
} }
// root.Dump(reader.Source(), 0) // root.Dump(reader.Source(), 0)
return root return root
} }
@ -1049,7 +1051,7 @@ func isBlankLine(lineNum, level int, stats []lineStat) bool {
func (p *parser) parseBlocks(parent ast.Node, reader text.Reader, pc Context) { func (p *parser) parseBlocks(parent ast.Node, reader text.Reader, pc Context) {
pc.SetOpenedBlocks([]Block{}) pc.SetOpenedBlocks([]Block{})
blankLines := make([]lineStat, 0, 128) blankLines := make([]lineStat, 0, 128)
isBlank := false var isBlank bool
for { // process blocks separated by blank lines for { // process blocks separated by blank lines
_, lines, ok := reader.SkipBlankLines() _, lines, ok := reader.SkipBlankLines()
if !ok { if !ok {
@ -1152,18 +1154,23 @@ func (p *parser) parseBlock(block text.BlockReader, parent ast.Node, pc Context)
break break
} }
lineLength := len(line) lineLength := len(line)
var lineBreakFlags uint8 = 0 var lineBreakFlags uint8
hasNewLine := line[lineLength-1] == '\n' hasNewLine := line[lineLength-1] == '\n'
if ((lineLength >= 3 && line[lineLength-2] == '\\' && line[lineLength-3] != '\\') || (lineLength == 2 && line[lineLength-2] == '\\')) && hasNewLine { // ends with \\n if ((lineLength >= 3 && line[lineLength-2] == '\\' &&
line[lineLength-3] != '\\') || (lineLength == 2 && line[lineLength-2] == '\\')) && hasNewLine { // ends with \\n
lineLength -= 2 lineLength -= 2
lineBreakFlags |= lineBreakHard | lineBreakVisible lineBreakFlags |= lineBreakHard | lineBreakVisible
} else if ((lineLength >= 4 && line[lineLength-3] == '\\' && line[lineLength-2] == '\r' && line[lineLength-4] != '\\') || (lineLength == 3 && line[lineLength-3] == '\\' && line[lineLength-2] == '\r')) && hasNewLine { // ends with \\r\n } else if ((lineLength >= 4 && line[lineLength-3] == '\\' && line[lineLength-2] == '\r' &&
line[lineLength-4] != '\\') || (lineLength == 3 && line[lineLength-3] == '\\' && line[lineLength-2] == '\r')) &&
hasNewLine { // ends with \\r\n
lineLength -= 3 lineLength -= 3
lineBreakFlags |= lineBreakHard | lineBreakVisible lineBreakFlags |= lineBreakHard | lineBreakVisible
} else if lineLength >= 3 && line[lineLength-3] == ' ' && line[lineLength-2] == ' ' && hasNewLine { // ends with [space][space]\n } else if lineLength >= 3 && line[lineLength-3] == ' ' && line[lineLength-2] == ' ' &&
hasNewLine { // ends with [space][space]\n
lineLength -= 3 lineLength -= 3
lineBreakFlags |= lineBreakHard lineBreakFlags |= lineBreakHard
} else if lineLength >= 4 && line[lineLength-4] == ' ' && line[lineLength-3] == ' ' && line[lineLength-2] == '\r' && hasNewLine { // ends with [space][space]\r\n } else if lineLength >= 4 && line[lineLength-4] == ' ' && line[lineLength-3] == ' ' &&
line[lineLength-2] == '\r' && hasNewLine { // ends with [space][space]\r\n
lineLength -= 4 lineLength -= 4
lineBreakFlags |= lineBreakHard lineBreakFlags |= lineBreakHard
} else if hasNewLine { } else if hasNewLine {
@ -1250,4 +1257,5 @@ func (p *parser) parseBlock(block text.BlockReader, parent ast.Node, pc Context)
for _, ip := range p.closeBlockers { for _, ip := range p.closeBlockers {
ip.CloseBlock(parent, block, pc) ip.CloseBlock(parent, block, pc)
} }
} }
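The reflowed conditions in parseBlock classify how a line ends: a trailing backslash or two trailing spaces before the newline mark a hard line break, while an escaped backslash does not. A simplified sketch of that classification follows. `classify` and the `breakKind` constants are hypothetical names, and the sketch strips the line ending first rather than indexing from the end as the original does, so treat it as an illustration of the rule rather than the parser's implementation.

```go
package main

import (
	"bytes"
	"fmt"
)

type breakKind int

const (
	noBreak       breakKind = iota // line has no trailing newline
	softBreak                      // plain newline
	hardBackslash                  // backslash immediately before the newline
	hardSpaces                     // two spaces immediately before the newline
)

// classify returns the line content without its break marker and the kind of
// line break that ends it. Both "\n" and "\r\n" endings are accepted.
func classify(line []byte) ([]byte, breakKind) {
	if !bytes.HasSuffix(line, []byte("\n")) {
		return line, noBreak
	}
	body := bytes.TrimSuffix(line, []byte("\n"))
	body = bytes.TrimSuffix(body, []byte("\r"))
	n := len(body)
	switch {
	// A backslash preceded by another backslash is treated as escaped,
	// mirroring the single look-behind in the diff.
	case n >= 1 && body[n-1] == '\\' && (n == 1 || body[n-2] != '\\'):
		return body[:n-1], hardBackslash
	case n >= 2 && body[n-1] == ' ' && body[n-2] == ' ':
		return body[:n-2], hardSpaces
	}
	return body, softBreak
}

func main() {
	for _, in := range []string{"foo\\\n", "foo  \r\n", "foo\\\\\n", "foo\n", "foo"} {
		content, kind := classify([]byte(in))
		fmt.Printf("%q -> %q kind=%d\n", in, content, kind)
	}
}
```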


@ -15,7 +15,7 @@ type rawHTMLParser struct {
var defaultRawHTMLParser = &rawHTMLParser{} var defaultRawHTMLParser = &rawHTMLParser{}
// NewRawHTMLParser returns a new InlineParser that can parse // NewRawHTMLParser returns a new InlineParser that can parse
// inline htmls // inline htmls.
func NewRawHTMLParser() InlineParser { func NewRawHTMLParser() InlineParser {
return defaultRawHTMLParser return defaultRawHTMLParser
} }
@ -49,7 +49,7 @@ func (s *rawHTMLParser) Parse(parent ast.Node, block text.Reader, pc Context) as
var tagnamePattern = `([A-Za-z][A-Za-z0-9-]*)` var tagnamePattern = `([A-Za-z][A-Za-z0-9-]*)`
var spaceOrOneNewline = `(?:[ \t]|(?:\r\n|\n){0,1})` var spaceOrOneNewline = `(?:[ \t]|(?:\r\n|\n){0,1})`
var attributePattern = `(?:[\r\n \t]+[a-zA-Z_:][a-zA-Z0-9:._-]*(?:[\r\n \t]*=[\r\n \t]*(?:[^\"'=<>` + "`" + `\x00-\x20]+|'[^']*'|"[^"]*"))?)` var attributePattern = `(?:[\r\n \t]+[a-zA-Z_:][a-zA-Z0-9:._-]*(?:[\r\n \t]*=[\r\n \t]*(?:[^\"'=<>` + "`" + `\x00-\x20]+|'[^']*'|"[^"]*"))?)` //nolint:golint,lll
var openTagRegexp = regexp.MustCompile("^<" + tagnamePattern + attributePattern + `*` + spaceOrOneNewline + `*/?>`) var openTagRegexp = regexp.MustCompile("^<" + tagnamePattern + attributePattern + `*` + spaceOrOneNewline + `*/?>`)
var closeTagRegexp = regexp.MustCompile("^</" + tagnamePattern + spaceOrOneNewline + `*>`) var closeTagRegexp = regexp.MustCompile("^</" + tagnamePattern + spaceOrOneNewline + `*>`)
@ -58,47 +58,38 @@ var closeProcessingInstruction = []byte("?>")
var openCDATA = []byte("<![CDATA[") var openCDATA = []byte("<![CDATA[")
var closeCDATA = []byte("]]>") var closeCDATA = []byte("]]>")
var closeDecl = []byte(">") var closeDecl = []byte(">")
var emptyComment = []byte("<!---->") var emptyComment1 = []byte("<!-->")
var invalidComment1 = []byte("<!-->") var emptyComment2 = []byte("<!--->")
var invalidComment2 = []byte("<!--->")
var openComment = []byte("<!--") var openComment = []byte("<!--")
var closeComment = []byte("-->") var closeComment = []byte("-->")
var doubleHyphen = []byte("--")
func (s *rawHTMLParser) parseComment(block text.Reader, pc Context) ast.Node { func (s *rawHTMLParser) parseComment(block text.Reader, pc Context) ast.Node {
savedLine, savedSegment := block.Position() savedLine, savedSegment := block.Position()
node := ast.NewRawHTML() node := ast.NewRawHTML()
line, segment := block.PeekLine() line, segment := block.PeekLine()
if bytes.HasPrefix(line, emptyComment) { if bytes.HasPrefix(line, emptyComment1) {
node.Segments.Append(segment.WithStop(segment.Start + len(emptyComment))) node.Segments.Append(segment.WithStop(segment.Start + len(emptyComment1)))
block.Advance(len(emptyComment)) block.Advance(len(emptyComment1))
return node return node
} }
if bytes.HasPrefix(line, invalidComment1) || bytes.HasPrefix(line, invalidComment2) { if bytes.HasPrefix(line, emptyComment2) {
return nil node.Segments.Append(segment.WithStop(segment.Start + len(emptyComment2)))
block.Advance(len(emptyComment2))
return node
} }
offset := len(openComment) offset := len(openComment)
line = line[offset:] line = line[offset:]
for { for {
hindex := bytes.Index(line, doubleHyphen) index := bytes.Index(line, closeComment)
if hindex > -1 { if index > -1 {
hindex += offset node.Segments.Append(segment.WithStop(segment.Start + offset + index + len(closeComment)))
} block.Advance(offset + index + len(closeComment))
index := bytes.Index(line, closeComment) + offset
if index > -1 && hindex == index {
if index == 0 || len(line) < 2 || line[index-offset-1] != '-' {
node.Segments.Append(segment.WithStop(segment.Start + index + len(closeComment)))
block.Advance(index + len(closeComment))
return node return node
} }
} offset = 0
if hindex > 0 {
break
}
node.Segments.Append(segment) node.Segments.Append(segment)
block.AdvanceLine() block.AdvanceLine()
line, segment = block.PeekLine() line, segment = block.PeekLine()
offset = 0
if line == nil { if line == nil {
break break
} }
@ -153,9 +144,8 @@ func (s *rawHTMLParser) parseMultiLineRegexp(reg *regexp.Regexp, block text.Read
if l == eline { if l == eline {
block.Advance(end - start) block.Advance(end - start)
break break
} else {
block.AdvanceLine()
} }
block.AdvanceLine()
} }
return node return node
} }
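The rewritten parseComment accepts `<!-->` and `<!--->` as complete comments and otherwise ends a comment at the first `-->`, without the earlier restriction on `--` inside the body. A minimal sketch of that rule over a single buffer is shown below. `commentLen` is a hypothetical helper, and the line-by-line segment handling of the real parser is deliberately left out.

```go
package main

import (
	"bytes"
	"fmt"
)

// commentLen returns the length in bytes of the HTML comment at the start of
// src, or 0 when src does not start with a complete comment.
func commentLen(src []byte) int {
	// The two shortest forms are complete comments on their own.
	for _, short := range [][]byte{[]byte("<!-->"), []byte("<!--->")} {
		if bytes.HasPrefix(src, short) {
			return len(short)
		}
	}
	open, closing := []byte("<!--"), []byte("-->")
	if !bytes.HasPrefix(src, open) {
		return 0
	}
	// Otherwise the comment runs up to and including the first "-->".
	i := bytes.Index(src[len(open):], closing)
	if i < 0 {
		return 0
	}
	return len(open) + i + len(closing)
}

func main() {
	for _, in := range []string{"<!-->x", "<!--->x", "<!---->", "<!-- a -- b --> tail", "<!-- open"} {
		fmt.Printf("%q -> %d\n", in, commentLen([]byte(in)))
	}
}
```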


@ -91,7 +91,7 @@ func (b *setextHeadingParser) Close(node ast.Node, reader text.Reader, pc Contex
para.Lines().Append(segment) para.Lines().Append(segment)
heading.Parent().InsertAfter(heading.Parent(), heading, para) heading.Parent().InsertAfter(heading.Parent(), heading, para)
} else { } else {
next.(ast.Node).Lines().Unshift(segment) next.Lines().Unshift(segment)
} }
heading.Parent().RemoveChild(heading.Parent(), heading) heading.Parent().RemoveChild(heading.Parent(), heading)
} else { } else {


@ -1,9 +1,11 @@
// Package html implements renderer that outputs HTMLs.
package html package html
import ( import (
"bytes" "bytes"
"fmt" "fmt"
"strconv" "strconv"
"unicode"
"unicode/utf8" "unicode/utf8"
"github.com/yuin/goldmark/ast" "github.com/yuin/goldmark/ast"
@ -15,7 +17,7 @@ import (
type Config struct { type Config struct {
Writer Writer Writer Writer
HardWraps bool HardWraps bool
EastAsianLineBreaks bool EastAsianLineBreaks EastAsianLineBreaks
XHTML bool XHTML bool
Unsafe bool Unsafe bool
} }
@ -25,7 +27,7 @@ func NewConfig() Config {
return Config{ return Config{
Writer: DefaultWriter, Writer: DefaultWriter,
HardWraps: false, HardWraps: false,
EastAsianLineBreaks: false, EastAsianLineBreaks: EastAsianLineBreaksNone,
XHTML: false, XHTML: false,
Unsafe: false, Unsafe: false,
} }
@ -37,7 +39,7 @@ func (c *Config) SetOption(name renderer.OptionName, value interface{}) {
case optHardWraps: case optHardWraps:
c.HardWraps = value.(bool) c.HardWraps = value.(bool)
case optEastAsianLineBreaks: case optEastAsianLineBreaks:
c.EastAsianLineBreaks = value.(bool) c.EastAsianLineBreaks = value.(EastAsianLineBreaks)
case optXHTML: case optXHTML:
c.XHTML = value.(bool) c.XHTML = value.(bool)
case optUnsafe: case optUnsafe:
@ -102,24 +104,94 @@ func WithHardWraps() interface {
// EastAsianLineBreaks is an option name used in WithEastAsianLineBreaks. // EastAsianLineBreaks is an option name used in WithEastAsianLineBreaks.
const optEastAsianLineBreaks renderer.OptionName = "EastAsianLineBreaks" const optEastAsianLineBreaks renderer.OptionName = "EastAsianLineBreaks"
// An EastAsianLineBreaks is a style of east asian line breaks.
type EastAsianLineBreaks int
const (
// EastAsianLineBreaksNone renders line breaks as they are.
EastAsianLineBreaksNone EastAsianLineBreaks = iota
// EastAsianLineBreaksSimple follows east_asian_line_breaks in Pandoc.
EastAsianLineBreaksSimple
// EastAsianLineBreaksCSS3Draft follows CSS text level3 "Segment Break Transformation Rules" with some enhancements.
EastAsianLineBreaksCSS3Draft
)
func (b EastAsianLineBreaks) softLineBreak(thisLastRune rune, siblingFirstRune rune) bool {
switch b {
case EastAsianLineBreaksNone:
return false
case EastAsianLineBreaksSimple:
return !(util.IsEastAsianWideRune(thisLastRune) && util.IsEastAsianWideRune(siblingFirstRune))
case EastAsianLineBreaksCSS3Draft:
return eastAsianLineBreaksCSS3DraftSoftLineBreak(thisLastRune, siblingFirstRune)
}
return false
}
func eastAsianLineBreaksCSS3DraftSoftLineBreak(thisLastRune rune, siblingFirstRune rune) bool {
// Implements CSS text level3 Segment Break Transformation Rules with some enhancements.
// References:
// - https://www.w3.org/TR/2020/WD-css-text-3-20200429/#line-break-transform
// - https://github.com/w3c/csswg-drafts/issues/5086
// Rule1:
// If the character immediately before or immediately after the segment break is
// the zero-width space character (U+200B), then the break is removed, leaving behind the zero-width space.
if thisLastRune == '\u200B' || siblingFirstRune == '\u200B' {
return false
}
// Rule2:
// Otherwise, if the East Asian Width property of both the character before and after the segment break is
// F, W, or H (not A), and neither side is Hangul, then the segment break is removed.
thisLastRuneEastAsianWidth := util.EastAsianWidth(thisLastRune)
siblingFirstRuneEastAsianWidth := util.EastAsianWidth(siblingFirstRune)
if (thisLastRuneEastAsianWidth == "F" ||
thisLastRuneEastAsianWidth == "W" ||
thisLastRuneEastAsianWidth == "H") &&
(siblingFirstRuneEastAsianWidth == "F" ||
siblingFirstRuneEastAsianWidth == "W" ||
siblingFirstRuneEastAsianWidth == "H") {
return unicode.Is(unicode.Hangul, thisLastRune) || unicode.Is(unicode.Hangul, siblingFirstRune)
}
// Rule3:
// Otherwise, if either the character before or after the segment break belongs to
// the space-discarding character set and it is a Unicode Punctuation (P*) or U+3000,
// then the segment break is removed.
if util.IsSpaceDiscardingUnicodeRune(thisLastRune) ||
unicode.IsPunct(thisLastRune) ||
thisLastRune == '\u3000' ||
util.IsSpaceDiscardingUnicodeRune(siblingFirstRune) ||
unicode.IsPunct(siblingFirstRune) ||
siblingFirstRune == '\u3000' {
return false
}
// Rule4:
// Otherwise, the segment break is converted to a space (U+0020).
return true
}
type withEastAsianLineBreaks struct { type withEastAsianLineBreaks struct {
eastAsianLineBreaksStyle EastAsianLineBreaks
} }
func (o *withEastAsianLineBreaks) SetConfig(c *renderer.Config) { func (o *withEastAsianLineBreaks) SetConfig(c *renderer.Config) {
c.Options[optEastAsianLineBreaks] = true c.Options[optEastAsianLineBreaks] = o.eastAsianLineBreaksStyle
} }
func (o *withEastAsianLineBreaks) SetHTMLOption(c *Config) { func (o *withEastAsianLineBreaks) SetHTMLOption(c *Config) {
c.EastAsianLineBreaks = true c.EastAsianLineBreaks = o.eastAsianLineBreaksStyle
} }
// WithEastAsianLineBreaks is a functional option that indicates whether softline breaks // WithEastAsianLineBreaks is a functional option that indicates whether softline breaks
// between east asian wide characters should be ignored. // between east asian wide characters should be ignored.
func WithEastAsianLineBreaks() interface { func WithEastAsianLineBreaks(e EastAsianLineBreaks) interface {
renderer.Option renderer.Option
Option Option
} { } {
return &withEastAsianLineBreaks{} return &withEastAsianLineBreaks{e}
} }
// XHTML is an option name used in WithXHTML. // XHTML is an option name used in WithXHTML.
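The option now carries a style value instead of a boolean, and the "simple" style drops a soft line break only when the characters on both sides are East Asian wide. The sketch below illustrates that decision in isolation. `lineBreakStyle`, `keepBreak` and `isWide` are hypothetical names, and `isWide` is only a rough approximation built from a few Unicode scripts; the library uses its own width tables via `util.IsEastAsianWideRune`, which are not reproduced here.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

type lineBreakStyle int

const (
	styleNone   lineBreakStyle = iota // keep every soft line break
	styleSimple                       // drop breaks between two wide characters
)

// isWide is a rough stand-in for an East Asian Width lookup (assumption:
// Han, Hiragana and Katakana are enough for illustration).
func isWide(r rune) bool {
	return unicode.In(r, unicode.Han, unicode.Hiragana, unicode.Katakana)
}

// keepBreak reports whether the newline between prev and next should be emitted.
func keepBreak(style lineBreakStyle, prev, next string) bool {
	if style == styleNone || prev == "" || next == "" {
		return true
	}
	last, _ := utf8.DecodeLastRuneInString(prev)
	first, _ := utf8.DecodeRuneInString(next)
	return !(isWide(last) && isWide(first))
}

func main() {
	fmt.Println(keepBreak(styleSimple, "日本", "語")) // false
	fmt.Println(keepBreak(styleSimple, "abc", "語")) // true
	fmt.Println(keepBreak(styleNone, "日本", "語"))   // true
}
```

Modelling the style as an integer type with a zero value of "none" keeps the default configuration unchanged while leaving room for further styles, such as the CSS3 draft rules added in the same hunk.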
@ -253,15 +325,17 @@ var GlobalAttributeFilter = util.NewBytesFilter(
[]byte("translate"), []byte("translate"),
) )
func (r *Renderer) renderDocument(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderDocument(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
// nothing to do // nothing to do
return ast.WalkContinue, nil return ast.WalkContinue, nil
} }
// HeadingAttributeFilter defines attribute names which heading elements can have // HeadingAttributeFilter defines attribute names which heading elements can have.
var HeadingAttributeFilter = GlobalAttributeFilter var HeadingAttributeFilter = GlobalAttributeFilter
func (r *Renderer) renderHeading(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderHeading(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.Heading) n := node.(*ast.Heading)
if entering { if entering {
_, _ = w.WriteString("<h") _, _ = w.WriteString("<h")
@ -278,12 +352,13 @@ func (r *Renderer) renderHeading(w util.BufWriter, source []byte, node ast.Node,
return ast.WalkContinue, nil return ast.WalkContinue, nil
} }
// BlockquoteAttributeFilter defines attribute names which blockquote elements can have // BlockquoteAttributeFilter defines attribute names which blockquote elements can have.
var BlockquoteAttributeFilter = GlobalAttributeFilter.Extend( var BlockquoteAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("cite"), []byte("cite"),
) )
func (r *Renderer) renderBlockquote(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderBlockquote(
w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if entering { if entering {
if n.Attributes() != nil { if n.Attributes() != nil {
_, _ = w.WriteString("<blockquote") _, _ = w.WriteString("<blockquote")
@ -308,7 +383,8 @@ func (r *Renderer) renderCodeBlock(w util.BufWriter, source []byte, n ast.Node,
return ast.WalkContinue, nil return ast.WalkContinue, nil
} }
func (r *Renderer) renderFencedCodeBlock(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderFencedCodeBlock(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.FencedCodeBlock) n := node.(*ast.FencedCodeBlock)
if entering { if entering {
_, _ = w.WriteString("<pre><code") _, _ = w.WriteString("<pre><code")
@ -326,7 +402,8 @@ func (r *Renderer) renderFencedCodeBlock(w util.BufWriter, source []byte, node a
return ast.WalkContinue, nil return ast.WalkContinue, nil
} }
func (r *Renderer) renderHTMLBlock(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderHTMLBlock(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.HTMLBlock) n := node.(*ast.HTMLBlock)
if entering { if entering {
if r.Unsafe { if r.Unsafe {
@ -368,7 +445,7 @@ func (r *Renderer) renderList(w util.BufWriter, source []byte, node ast.Node, en
_ = w.WriteByte('<') _ = w.WriteByte('<')
_, _ = w.WriteString(tag) _, _ = w.WriteString(tag)
if n.IsOrdered() && n.Start != 1 { if n.IsOrdered() && n.Start != 1 {
fmt.Fprintf(w, " start=\"%d\"", n.Start) _, _ = fmt.Fprintf(w, " start=\"%d\"", n.Start)
} }
if n.Attributes() != nil { if n.Attributes() != nil {
RenderAttributes(w, n, ListAttributeFilter) RenderAttributes(w, n, ListAttributeFilter)
@ -428,7 +505,7 @@ func (r *Renderer) renderParagraph(w util.BufWriter, source []byte, n ast.Node,
func (r *Renderer) renderTextBlock(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderTextBlock(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering { if !entering {
if _, ok := n.NextSibling().(ast.Node); ok && n.FirstChild() != nil { if n.NextSibling() != nil && n.FirstChild() != nil {
_ = w.WriteByte('\n') _ = w.WriteByte('\n')
} }
} }
@ -444,7 +521,8 @@ var ThematicAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("width"), // [Deprecated] []byte("width"), // [Deprecated]
) )
func (r *Renderer) renderThematicBreak(w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderThematicBreak(
w util.BufWriter, source []byte, n ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering { if !entering {
return ast.WalkContinue, nil return ast.WalkContinue, nil
} }
@ -473,7 +551,8 @@ var LinkAttributeFilter = GlobalAttributeFilter.Extend(
[]byte("target"), []byte("target"),
) )
func (r *Renderer) renderAutoLink(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderAutoLink(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.AutoLink) n := node.(*ast.AutoLink)
if !entering { if !entering {
return ast.WalkContinue, nil return ast.WalkContinue, nil
@ -528,7 +607,8 @@ func (r *Renderer) renderCodeSpan(w util.BufWriter, source []byte, n ast.Node, e
// EmphasisAttributeFilter defines attribute names which emphasis elements can have. // EmphasisAttributeFilter defines attribute names which emphasis elements can have.
var EmphasisAttributeFilter = GlobalAttributeFilter var EmphasisAttributeFilter = GlobalAttributeFilter
func (r *Renderer) renderEmphasis(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderEmphasis(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
n := node.(*ast.Emphasis) n := node.(*ast.Emphasis)
tag := "em" tag := "em"
if n.Level == 2 { if n.Level == 2 {
@ -600,7 +680,7 @@ func (r *Renderer) renderImage(w util.BufWriter, source []byte, node ast.Node, e
_, _ = w.Write(util.EscapeHTML(util.URLEscape(n.Destination, true))) _, _ = w.Write(util.EscapeHTML(util.URLEscape(n.Destination, true)))
} }
_, _ = w.WriteString(`" alt="`) _, _ = w.WriteString(`" alt="`)
_, _ = w.Write(nodeToHTMLText(n, source)) r.renderTexts(w, source, n)
_ = w.WriteByte('"') _ = w.WriteByte('"')
if n.Title != nil { if n.Title != nil {
_, _ = w.WriteString(` title="`) _, _ = w.WriteString(` title="`)
@ -618,7 +698,8 @@ func (r *Renderer) renderImage(w util.BufWriter, source []byte, node ast.Node, e
return ast.WalkSkipChildren, nil return ast.WalkSkipChildren, nil
} }
func (r *Renderer) renderRawHTML(w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) { func (r *Renderer) renderRawHTML(
w util.BufWriter, source []byte, node ast.Node, entering bool) (ast.WalkStatus, error) {
if !entering { if !entering {
return ast.WalkSkipChildren, nil return ast.WalkSkipChildren, nil
} }
@ -653,14 +734,13 @@ func (r *Renderer) renderText(w util.BufWriter, source []byte, node ast.Node, en
_, _ = w.WriteString("<br>\n") _, _ = w.WriteString("<br>\n")
} }
} else if n.SoftLineBreak() { } else if n.SoftLineBreak() {
if r.EastAsianLineBreaks && len(value) != 0 { if r.EastAsianLineBreaks != EastAsianLineBreaksNone && len(value) != 0 {
sibling := node.NextSibling() sibling := node.NextSibling()
if sibling != nil && sibling.Kind() == ast.KindText { if sibling != nil && sibling.Kind() == ast.KindText {
if siblingText := sibling.(*ast.Text).Text(source); len(siblingText) != 0 { if siblingText := sibling.(*ast.Text).Value(source); len(siblingText) != 0 {
thisLastRune := util.ToRune(value, len(value)-1) thisLastRune := util.ToRune(value, len(value)-1)
siblingFirstRune, _ := utf8.DecodeRune(siblingText) siblingFirstRune, _ := utf8.DecodeRune(siblingText)
if !(util.IsEastAsianWideRune(thisLastRune) && if r.EastAsianLineBreaks.softLineBreak(thisLastRune, siblingFirstRune) {
util.IsEastAsianWideRune(siblingFirstRune)) {
_ = w.WriteByte('\n') _ = w.WriteByte('\n')
} }
} }
@ -690,6 +770,18 @@ func (r *Renderer) renderString(w util.BufWriter, source []byte, node ast.Node,
return ast.WalkContinue, nil return ast.WalkContinue, nil
} }
func (r *Renderer) renderTexts(w util.BufWriter, source []byte, n ast.Node) {
for c := n.FirstChild(); c != nil; c = c.NextSibling() {
if s, ok := c.(*ast.String); ok {
_, _ = r.renderString(w, source, s, true)
} else if t, ok := c.(*ast.Text); ok {
_, _ = r.renderText(w, source, t, true)
} else {
r.renderTexts(w, source, c)
}
}
}
var dataPrefix = []byte("data-") var dataPrefix = []byte("data-")
// RenderAttributes renders given node's attributes. // RenderAttributes renders given node's attributes.
@ -706,7 +798,14 @@ func RenderAttributes(w util.BufWriter, node ast.Node, filter util.BytesFilter)
_, _ = w.Write(attr.Name) _, _ = w.Write(attr.Name)
_, _ = w.WriteString(`="`) _, _ = w.WriteString(`="`)
// TODO: convert numeric values to strings // TODO: convert numeric values to strings
_, _ = w.Write(util.EscapeHTML(attr.Value.([]byte))) var value []byte
switch typed := attr.Value.(type) {
case []byte:
value = typed
case string:
value = util.StringToReadOnlyBytes(typed)
}
_, _ = w.Write(util.EscapeHTML(value))
_ = w.WriteByte('"') _ = w.WriteByte('"')
} }
} }
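Review note: the old line used an unchecked `attr.Value.([]byte)` assertion, which panics when the attribute value has any other dynamic type. The type switch accepts both `[]byte` and `string`. Below is a minimal standalone sketch of that coercion. `attrBytes` is a hypothetical helper name, and it copies the string with `[]byte(s)` where the patch uses `util.StringToReadOnlyBytes`.

```go
package main

import "fmt"

// attrBytes is a hypothetical helper that mirrors the type switch in the
// patched RenderAttributes. Unsupported types yield nil in this sketch.
func attrBytes(v interface{}) []byte {
	switch typed := v.(type) {
	case []byte:
		return typed
	case string:
		// The patch avoids this copy via util.StringToReadOnlyBytes.
		return []byte(typed)
	}
	return nil
}

func main() {
	fmt.Printf("%q\n", attrBytes([]byte("from-bytes")))
	fmt.Printf("%q\n", attrBytes("from-string"))
	fmt.Printf("%v\n", attrBytes(42) == nil)
}
```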
@ -920,17 +1019,3 @@ func IsDangerousURL(url []byte) bool {
return hasPrefix(url, bJs) || hasPrefix(url, bVb) || return hasPrefix(url, bJs) || hasPrefix(url, bVb) ||
hasPrefix(url, bFile) || hasPrefix(url, bData) hasPrefix(url, bFile) || hasPrefix(url, bData)
} }
func nodeToHTMLText(n ast.Node, source []byte) []byte {
var buf bytes.Buffer
for c := n.FirstChild(); c != nil; c = c.NextSibling() {
if s, ok := c.(*ast.String); ok && s.IsCode() {
buf.Write(s.Text(source))
} else if !c.HasChildren() {
buf.Write(util.EscapeHTML(c.Text(source)))
} else {
buf.Write(nodeToHTMLText(c, source))
}
}
return buf.Bytes()
}


@ -16,7 +16,7 @@ type Config struct {
NodeRenderers util.PrioritizedSlice NodeRenderers util.PrioritizedSlice
} }
// NewConfig returns a new Config // NewConfig returns a new Config.
func NewConfig() *Config { func NewConfig() *Config {
return &Config{ return &Config{
Options: map[OptionName]interface{}{}, Options: map[OptionName]interface{}{},
@ -78,7 +78,7 @@ type NodeRenderer interface {
RegisterFuncs(NodeRendererFuncRegisterer) RegisterFuncs(NodeRendererFuncRegisterer)
} }
// A NodeRendererFuncRegisterer registers // A NodeRendererFuncRegisterer registers given NodeRendererFunc to this object.
type NodeRendererFuncRegisterer interface { type NodeRendererFuncRegisterer interface {
// Register registers given NodeRendererFunc to this object. // Register registers given NodeRendererFunc to this object.
Register(ast.NodeKind, NodeRendererFunc) Register(ast.NodeKind, NodeRendererFunc)


@ -1,3 +1,4 @@
// Package testutil provides utilities for unit tests.
package testutil package testutil
import ( import (
@ -65,7 +66,7 @@ type MarkdownTestCaseOptions struct {
const attributeSeparator = "//- - - - - - - - -//" const attributeSeparator = "//- - - - - - - - -//"
const caseSeparator = "//= = = = = = = = = = = = = = = = = = = = = = = =//" const caseSeparator = "//= = = = = = = = = = = = = = = = = = = = = = = =//"
var optionsRegexp *regexp.Regexp = regexp.MustCompile(`(?i)\s*options:(.*)`) var optionsRegexp = regexp.MustCompile(`(?i)\s*options:(.*)`)
// ParseCliCaseArg parses -case command line args. // ParseCliCaseArg parses -case command line args.
func ParseCliCaseArg() []int { func ParseCliCaseArg() []int {
@ -90,7 +91,9 @@ func DoTestCaseFile(m goldmark.Markdown, filename string, t TestingT, no ...int)
if err != nil { if err != nil {
panic(err) panic(err)
} }
defer fp.Close() defer func() {
_ = fp.Close()
}()
scanner := bufio.NewScanner(fp) scanner := bufio.NewScanner(fp)
c := MarkdownTestCase{ c := MarkdownTestCase{

text/package.go (new file, 2 lines added)

@ -0,0 +1,2 @@
// Package text provides functionalities to manipulate texts.
package text


@ -76,7 +76,7 @@ type Reader interface {
FindClosure(opener, closer byte, options FindClosureOptions) (*Segments, bool) FindClosure(opener, closer byte, options FindClosureOptions) (*Segments, bool)
} }
// FindClosureOptions is options for Reader.FindClosure // FindClosureOptions is options for Reader.FindClosure.
type FindClosureOptions struct { type FindClosureOptions struct {
// CodeSpan is a flag for the FindClosure. If this is set to true, // CodeSpan is a flag for the FindClosure. If this is set to true,
// FindClosure ignores closers in codespans. // FindClosure ignores closers in codespans.
@ -154,7 +154,7 @@ func (r *reader) PeekLine() ([]byte, Segment) {
return nil, r.pos return nil, r.pos
} }
// io.RuneReader interface // io.RuneReader interface.
func (r *reader) ReadRune() (rune, int, error) { func (r *reader) ReadRune() (rune, int, error) {
return readRuneReader(r) return readRuneReader(r)
} }
@ -354,7 +354,7 @@ func (r *blockReader) Value(seg Segment) []byte {
return ret return ret
} }
// io.RuneReader interface // io.RuneReader interface.
func (r *blockReader) ReadRune() (rune, int, error) { func (r *blockReader) ReadRune() (rune, int, error) {
return readRuneReader(r) return readRuneReader(r)
} }


@ -2,6 +2,7 @@ package text
import ( import (
"bytes" "bytes"
"github.com/yuin/goldmark/util" "github.com/yuin/goldmark/util"
) )
@ -18,6 +19,20 @@ type Segment struct {
// Padding is a padding length of the segment. // Padding is a padding length of the segment.
Padding int Padding int
// ForceNewline is true if the segment should be ended with a newline.
// Some elements (e.g. CodeBlock, FencedCodeBlock) do not trim trailing
// newlines. The spec treats EOF as a newline, so we need to
// add a newline to the end of the segment if it is not empty.
//
// e.g.:
//
// ```go
// const test = "test"
//
// This code does not close the code block and ends with EOF. In this case,
// we need to add a newline to the end of the last line like `const test = "test"\n`.
ForceNewline bool
} }
// NewSegment return a new Segment. // NewSegment return a new Segment.
@ -40,12 +55,18 @@ func NewSegmentPadding(start, stop, n int) Segment {
// Value returns a value of the segment. // Value returns a value of the segment.
func (t *Segment) Value(buffer []byte) []byte { func (t *Segment) Value(buffer []byte) []byte {
var result []byte
if t.Padding == 0 { if t.Padding == 0 {
return buffer[t.Start:t.Stop] result = buffer[t.Start:t.Stop]
} } else {
result := make([]byte, 0, t.Padding+t.Stop-t.Start+1) result = make([]byte, 0, t.Padding+t.Stop-t.Start+1)
result = append(result, bytes.Repeat(space, t.Padding)...) result = append(result, bytes.Repeat(space, t.Padding)...)
return append(result, buffer[t.Start:t.Stop]...) result = append(result, buffer[t.Start:t.Stop]...)
}
if t.ForceNewline && len(result) > 0 && result[len(result)-1] != '\n' {
result = append(result, '\n')
}
return result
} }
// Len returns a length of the segment. // Len returns a length of the segment.
@ -207,3 +228,12 @@ func (s *Segments) Unshift(v Segment) {
s.values = append(s.values[0:1], s.values[0:]...) s.values = append(s.values[0:1], s.values[0:]...)
s.values[0] = v s.values[0] = v
} }
// Value returns a string value of the collection.
func (s *Segments) Value(buffer []byte) []byte {
var result []byte
for _, v := range s.values {
result = append(result, v.Value(buffer)...)
}
return result
}
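Review note: the `ForceNewline` behaviour can be tried in isolation. The sketch below re-declares a simplified `Segment` (an assumption for illustration, not the real `text.Segment`) and copies the patched `Value` logic: a newline is appended only when the flag is set, the result is non-empty, and it does not already end in `\n`.

```go
package main

import (
	"bytes"
	"fmt"
)

// Segment is a simplified stand-in for text.Segment from the diff.
type Segment struct {
	Start, Stop, Padding int
	ForceNewline         bool
}

// Value mirrors the patched logic: build the padded or unpadded slice,
// then append a trailing newline if ForceNewline requires one.
func (s Segment) Value(buffer []byte) []byte {
	var result []byte
	if s.Padding == 0 {
		result = buffer[s.Start:s.Stop]
	} else {
		result = make([]byte, 0, s.Padding+s.Stop-s.Start+1)
		result = append(result, bytes.Repeat([]byte{' '}, s.Padding)...)
		result = append(result, buffer[s.Start:s.Stop]...)
	}
	if s.ForceNewline && len(result) > 0 && result[len(result)-1] != '\n' {
		result = append(result, '\n')
	}
	return result
}

func main() {
	// An unterminated last line, as in an unclosed fenced code block at EOF.
	src := []byte(`const test = "test"`)
	last := Segment{Start: 0, Stop: len(src), ForceNewline: true}
	fmt.Printf("%q\n", last.Value(src))
}
```

An empty segment stays empty even with the flag set, which matches the "if it is not empty" wording in the field comment.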


@ -1,5 +1,8 @@
//nolint:golint,lll,misspell
package util package util
import "sync"
// An HTML5Entity struct represents HTML5 entitites. // An HTML5Entity struct represents HTML5 entitites.
type HTML5Entity struct { type HTML5Entity struct {
Name string Name string
@ -8,13 +11,20 @@ type HTML5Entity struct {
} }
// LookUpHTML5EntityByName returns (an HTML5Entity, true) if an entity named // LookUpHTML5EntityByName returns (an HTML5Entity, true) if an entity named
// given name is found, otherwise (nil, false) // given name is found, otherwise (nil, false).
func LookUpHTML5EntityByName(name string) (*HTML5Entity, bool) { func LookUpHTML5EntityByName(name string) (*HTML5Entity, bool) {
v, ok := html5entities[name] v, ok := html5entities()[name]
return v, ok return v, ok
} }
var html5entities = map[string]*HTML5Entity{ var html5entitiesOnce sync.Once // TODO: uses sync.OnceValue for future
var _html5entities map[string]*HTML5Entity
func html5entities() map[string]*HTML5Entity {
html5entitiesOnce.Do(func() {
_html5entities =
map[string]*HTML5Entity{
"AElig": {Name: "AElig", CodePoints: []int{198}, Characters: []byte{0xc3, 0x86}}, "AElig": {Name: "AElig", CodePoints: []int{198}, Characters: []byte{0xc3, 0x86}},
"AMP": {Name: "AMP", CodePoints: []int{38}, Characters: []byte{0x26}}, "AMP": {Name: "AMP", CodePoints: []int{38}, Characters: []byte{0x26}},
"Aacute": {Name: "Aacute", CodePoints: []int{193}, Characters: []byte{0xc3, 0x81}}, "Aacute": {Name: "Aacute", CodePoints: []int{193}, Characters: []byte{0xc3, 0x81}},
@ -2139,4 +2149,7 @@ var html5entities = map[string]*HTML5Entity{
"zscr": {Name: "zscr", CodePoints: []int{120015}, Characters: []byte{0xf0, 0x9d, 0x93, 0x8f}}, "zscr": {Name: "zscr", CodePoints: []int{120015}, Characters: []byte{0xf0, 0x9d, 0x93, 0x8f}},
"zwj": {Name: "zwj", CodePoints: []int{8205}, Characters: []byte{0xe2, 0x80, 0x8d}}, "zwj": {Name: "zwj", CodePoints: []int{8205}, Characters: []byte{0xe2, 0x80, 0x8d}},
"zwnj": {Name: "zwnj", CodePoints: []int{8204}, Characters: []byte{0xe2, 0x80, 0x8c}}, "zwnj": {Name: "zwnj", CodePoints: []int{8204}, Characters: []byte{0xe2, 0x80, 0x8c}},
}
})
return _html5entities
} }
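Review note: the change above replaces a package-level map literal with a function that builds the map on first use. The sketch below shows the same `sync.Once` pattern on a two-entry table. The names `entity`, `entities`, `lookUp`, and `buildCount` are made up for illustration; `buildCount` exists only to show that the initializer runs once.

```go
package main

import (
	"fmt"
	"sync"
)

// entity is a reduced, hypothetical version of HTML5Entity.
type entity struct {
	Name       string
	CodePoints []int
}

var (
	entitiesOnce sync.Once
	entitiesMap  map[string]*entity
	buildCount   int // demonstration only: counts initializer runs
)

// entities builds the table on the first call and reuses it afterwards.
func entities() map[string]*entity {
	entitiesOnce.Do(func() {
		buildCount++
		entitiesMap = map[string]*entity{
			"AMP":  {Name: "AMP", CodePoints: []int{38}},
			"zwnj": {Name: "zwnj", CodePoints: []int{8204}},
		}
	})
	return entitiesMap
}

// lookUp has the same shape as LookUpHTML5EntityByName.
func lookUp(name string) (*entity, bool) {
	v, ok := entities()[name]
	return v, ok
}

func main() {
	if e, ok := lookUp("AMP"); ok {
		fmt.Println(e.Name, e.CodePoints)
	}
	_, ok := lookUp("nope")
	fmt.Println(ok, buildCount)
}
```

The TODO in the diff points at `sync.OnceValue`, available from Go 1.21, which would fold the `Once` and the cached variable into a single function value.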

File diff suppressed because it is too large.


@ -63,12 +63,13 @@ func (b *CopyOnWriteBuffer) AppendString(value string) {
// WriteByte writes the given byte to the buffer. // WriteByte writes the given byte to the buffer.
// WriteByte allocate new buffer and clears it at the first time. // WriteByte allocate new buffer and clears it at the first time.
func (b *CopyOnWriteBuffer) WriteByte(c byte) { func (b *CopyOnWriteBuffer) WriteByte(c byte) error {
if !b.copied { if !b.copied {
b.buffer = make([]byte, 0, len(b.buffer)+20) b.buffer = make([]byte, 0, len(b.buffer)+20)
b.copied = true b.copied = true
} }
b.buffer = append(b.buffer, c) b.buffer = append(b.buffer, c)
return nil
} }
// AppendByte appends given bytes to the buffer. // AppendByte appends given bytes to the buffer.
@ -150,7 +151,7 @@ func TabWidth(currentPos int) int {
// width: 1234 5678 // width: 1234 5678
// //
// width=2 is in the tab character. In this case, IndentPosition returns // width=2 is in the tab character. In this case, IndentPosition returns
// (pos=1, padding=2) // (pos=1, padding=2).
func IndentPosition(bs []byte, currentPos, width int) (pos, padding int) { func IndentPosition(bs []byte, currentPos, width int) (pos, padding int) {
return IndentPositionPadding(bs, currentPos, 0, width) return IndentPositionPadding(bs, currentPos, 0, width)
} }
@ -165,7 +166,13 @@ func IndentPositionPadding(bs []byte, currentPos, paddingv, width int) (pos, pad
w := 0 w := 0
i := 0 i := 0
l := len(bs) l := len(bs)
p := paddingv
for ; i < l; i++ { for ; i < l; i++ {
if p > 0 {
p--
w++
continue
}
if bs[i] == '\t' && w < width { if bs[i] == '\t' && w < width {
w += TabWidth(currentPos + w) w += TabWidth(currentPos + w)
} else if bs[i] == ' ' && w < width { } else if bs[i] == ' ' && w < width {
@ -424,7 +431,7 @@ func DoFullUnicodeCaseFolding(v []byte) []byte {
if c >= 0x41 && c <= 0x5a { if c >= 0x41 && c <= 0x5a {
// A-Z to a-z // A-Z to a-z
cob.Write(v[n:i]) cob.Write(v[n:i])
cob.WriteByte(c + 32) _ = cob.WriteByte(c + 32)
n = i + 1 n = i + 1
} }
continue continue
@ -521,7 +528,7 @@ func ToLinkReference(v []byte) string {
return string(ReplaceSpaces(v, ' ')) return string(ReplaceSpaces(v, ' '))
} }
var htmlEscapeTable = [256][]byte{nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&quot;"), nil, nil, nil, []byte("&amp;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&lt;"), nil, []byte("&gt;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil} var htmlEscapeTable = [256][]byte{nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&quot;"), nil, nil, nil, []byte("&amp;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, []byte("&lt;"), nil, []byte("&gt;"), nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, 
nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil, nil} //nolint:golint,lll
// EscapeHTMLByte returns HTML escaped bytes if the given byte should be escaped, // EscapeHTMLByte returns HTML escaped bytes if the given byte should be escaped,
// otherwise nil. // otherwise nil.
@ -557,7 +564,7 @@ func UnescapePunctuations(source []byte) []byte {
c := source[i] c := source[i]
if i < limit-1 && c == '\\' && IsPunct(source[i+1]) { if i < limit-1 && c == '\\' && IsPunct(source[i+1]) {
cob.Write(source[n:i]) cob.Write(source[n:i])
cob.WriteByte(source[i+1]) _ = cob.WriteByte(source[i+1])
i += 2 i += 2
n = i n = i
continue continue
@ -573,9 +580,9 @@ func UnescapePunctuations(source []byte) []byte {
// ResolveNumericReferences resolve numeric references like '&#1234;" . // ResolveNumericReferences resolve numeric references like '&#1234;" .
func ResolveNumericReferences(source []byte) []byte { func ResolveNumericReferences(source []byte) []byte {
cob := NewCopyOnWriteBuffer(source) cob := NewCopyOnWriteBuffer(source)
buf := make([]byte, 6, 6) buf := make([]byte, 6)
limit := len(source) limit := len(source)
ok := false var ok bool
n := 0 n := 0
for i := 0; i < limit; i++ { for i := 0; i < limit; i++ {
if source[i] == '&' { if source[i] == '&' {
@ -625,7 +632,7 @@ func ResolveNumericReferences(source []byte) []byte {
func ResolveEntityNames(source []byte) []byte { func ResolveEntityNames(source []byte) []byte {
cob := NewCopyOnWriteBuffer(source) cob := NewCopyOnWriteBuffer(source)
limit := len(source) limit := len(source)
ok := false var ok bool
n := 0 n := 0
for i := 0; i < limit; i++ { for i := 0; i < limit; i++ {
if source[i] == '&' { if source[i] == '&' {
@ -750,7 +757,7 @@ func FindURLIndex(b []byte) int {
return i return i
} }
var emailDomainRegexp = regexp.MustCompile(`^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*`) var emailDomainRegexp = regexp.MustCompile(`^[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*`) //nolint:golint,lll
// FindEmailIndex returns a stop index value if the given bytes seem an email address. // FindEmailIndex returns a stop index value if the given bytes seem an email address.
func FindEmailIndex(b []byte) int { func FindEmailIndex(b []byte) int {
@ -781,18 +788,19 @@ func FindEmailIndex(b []byte) int {
var spaces = []byte(" \t\n\x0b\x0c\x0d") var spaces = []byte(" \t\n\x0b\x0c\x0d")
var spaceTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} var spaceTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
var punctTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} var punctTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
// a-zA-Z0-9, ;/?:@&=+$,-_.!~*'()# // a-zA-Z0-9, ;/?:@&=+$,-_.!~*'()#
var urlEscapeTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
var utf8lenTable = [256]int8{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 99, 99, 99, 99, 99, 99, 99, 99} var urlEscapeTable = [256]int8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
var urlTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 5, 5, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 0, 1, 0, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1} var utf8lenTable = [256]int8{1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 99, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 99, 99, 99, 99, 99, 99, 99, 99} //nolint:golint,lll
var emailTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} var urlTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 5, 1, 5, 5, 1, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 1, 1, 0, 1, 0, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1} //nolint:golint,lll
var emailTable = [256]uint8{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0} //nolint:golint,lll
// UTF8Len returns a byte length of the utf-8 character. // UTF8Len returns a byte length of the utf-8 character.
func UTF8Len(b byte) int8 { func UTF8Len(b byte) int8 {
@ -806,7 +814,7 @@ func IsPunct(c byte) bool {
// IsPunctRune returns true if the given rune is a punctuation, otherwise false. // IsPunctRune returns true if the given rune is a punctuation, otherwise false.
func IsPunctRune(r rune) bool { func IsPunctRune(r rune) bool {
return int32(r) <= 256 && IsPunct(byte(r)) || unicode.IsPunct(r) return unicode.IsSymbol(r) || unicode.IsPunct(r)
} }
// IsSpace returns true if the given character is a space, otherwise false. // IsSpace returns true if the given character is a space, otherwise false.
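Review note: the new `IsPunctRune` also treats Unicode symbols (category S) as punctuation. The sketch below compares the two versions on a few runes. `isASCIIPunct` is a hand-written stand-in for the `punctTable` lookup behind `IsPunct`, so treat it as an approximation of the old behaviour rather than the library code.

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCIIPunct approximates the punctTable lookup: the four ASCII
// punctuation ranges !-/ :-@ [-` {-~.
func isASCIIPunct(c byte) bool {
	return c >= 0x21 && c <= 0x2F || c >= 0x3A && c <= 0x40 ||
		c >= 0x5B && c <= 0x60 || c >= 0x7B && c <= 0x7E
}

// oldIsPunctRune follows the removed line of the diff.
func oldIsPunctRune(r rune) bool {
	return r <= 256 && isASCIIPunct(byte(r)) || unicode.IsPunct(r)
}

// newIsPunctRune follows the added line of the diff.
func newIsPunctRune(r rune) bool {
	return unicode.IsSymbol(r) || unicode.IsPunct(r)
}

func main() {
	// '$' is an ASCII symbol, '€' a non-ASCII currency symbol,
	// '。' CJK punctuation, 'a' a letter.
	for _, r := range []rune{'$', '€', '。', 'a'} {
		fmt.Printf("%q old=%v new=%v\n", r, oldIsPunctRune(r), newIsPunctRune(r))
	}
}
```

Only the non-ASCII symbol changes classification here; ASCII symbols such as `$` were already covered by the table.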
@ -834,15 +842,6 @@ func IsAlphaNumeric(c byte) bool {
return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= '0' && c <= '9' return c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= '0' && c <= '9'
} }
// IsEastAsianWideRune returns trhe if the given rune is an east asian wide character, otherwise false.
func IsEastAsianWideRune(r rune) bool {
return unicode.Is(unicode.Hiragana, r) ||
unicode.Is(unicode.Katakana, r) ||
unicode.Is(unicode.Han, r) ||
unicode.Is(unicode.Lm, r) ||
unicode.Is(unicode.Hangul, r)
}
// A BufWriter is a subset of the bufio.Writer . // A BufWriter is a subset of the bufio.Writer .
type BufWriter interface { type BufWriter interface {
io.Writer io.Writer
@ -862,7 +861,7 @@ type PrioritizedValue struct {
Priority int Priority int
} }
// PrioritizedSlice is a slice of the PrioritizedValues // PrioritizedSlice is a slice of the PrioritizedValues.
type PrioritizedSlice []PrioritizedValue type PrioritizedSlice []PrioritizedValue
// Sort sorts the PrioritizedSlice in ascending order. // Sort sorts the PrioritizedSlice in ascending order.
@ -977,7 +976,7 @@ func (s *bytesFilter) Contains(b []byte) bool {
} }
h := bytesHash(b) % uint64(len(s.slots)) h := bytesHash(b) % uint64(len(s.slots))
slot := s.slots[h] slot := s.slots[h]
if slot == nil || len(slot) == 0 { if len(slot) == 0 {
return false return false
} }
for _, element := range slot { for _, element := range slot {

util/util_cjk.go (new file, 469 lines added)

@ -0,0 +1,469 @@
package util
import "unicode"
var cjkRadicalsSupplement = &unicode.RangeTable{
R16: []unicode.Range16{
{0x2E80, 0x2EFF, 1},
},
}
var kangxiRadicals = &unicode.RangeTable{
R16: []unicode.Range16{
{0x2F00, 0x2FDF, 1},
},
}
var ideographicDescriptionCharacters = &unicode.RangeTable{
R16: []unicode.Range16{
{0x2FF0, 0x2FFF, 1},
},
}
var cjkSymbolsAndPunctuation = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3000, 0x303F, 1},
},
}
var hiragana = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3040, 0x309F, 1},
},
}
var katakana = &unicode.RangeTable{
R16: []unicode.Range16{
{0x30A0, 0x30FF, 1},
},
}
var kanbun = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3130, 0x318F, 1},
{0x3190, 0x319F, 1},
},
}
var cjkStrokes = &unicode.RangeTable{
R16: []unicode.Range16{
{0x31C0, 0x31EF, 1},
},
}
var katakanaPhoneticExtensions = &unicode.RangeTable{
R16: []unicode.Range16{
{0x31F0, 0x31FF, 1},
},
}
var cjkCompatibility = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3300, 0x33FF, 1},
},
}
var cjkUnifiedIdeographsExtensionA = &unicode.RangeTable{
R16: []unicode.Range16{
{0x3400, 0x4DBF, 1},
},
}
var cjkUnifiedIdeographs = &unicode.RangeTable{
R16: []unicode.Range16{
{0x4E00, 0x9FFF, 1},
},
}
var yiSyllables = &unicode.RangeTable{
R16: []unicode.Range16{
{0xA000, 0xA48F, 1},
},
}
var yiRadicals = &unicode.RangeTable{
R16: []unicode.Range16{
{0xA490, 0xA4CF, 1},
},
}
var cjkCompatibilityIdeographs = &unicode.RangeTable{
R16: []unicode.Range16{
{0xF900, 0xFAFF, 1},
},
}
var verticalForms = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFE10, 0xFE1F, 1},
},
}
var cjkCompatibilityForms = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFE30, 0xFE4F, 1},
},
}
var smallFormVariants = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFE50, 0xFE6F, 1},
},
}
var halfwidthAndFullwidthForms = &unicode.RangeTable{
R16: []unicode.Range16{
{0xFF00, 0xFFEF, 1},
},
}
var kanaSupplement = &unicode.RangeTable{
R32: []unicode.Range32{
{0x1B000, 0x1B0FF, 1},
},
}
var kanaExtendedA = &unicode.RangeTable{
R32: []unicode.Range32{
{0x1B100, 0x1B12F, 1},
},
}
var smallKanaExtension = &unicode.RangeTable{
R32: []unicode.Range32{
{0x1B130, 0x1B16F, 1},
},
}
var cjkUnifiedIdeographsExtensionB = &unicode.RangeTable{
R32: []unicode.Range32{
{0x20000, 0x2A6DF, 1},
},
}
var cjkUnifiedIdeographsExtensionC = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2A700, 0x2B73F, 1},
},
}
var cjkUnifiedIdeographsExtensionD = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2B740, 0x2B81F, 1},
},
}
var cjkUnifiedIdeographsExtensionE = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2B820, 0x2CEAF, 1},
},
}
var cjkUnifiedIdeographsExtensionF = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2CEB0, 0x2EBEF, 1},
},
}
var cjkCompatibilityIdeographsSupplement = &unicode.RangeTable{
R32: []unicode.Range32{
{0x2F800, 0x2FA1F, 1},
},
}
var cjkUnifiedIdeographsExtensionG = &unicode.RangeTable{
R32: []unicode.Range32{
{0x30000, 0x3134F, 1},
},
}
// IsEastAsianWideRune returns true if the given rune is an East Asian wide character, otherwise false.
func IsEastAsianWideRune(r rune) bool {
return unicode.Is(unicode.Hiragana, r) ||
unicode.Is(unicode.Katakana, r) ||
unicode.Is(unicode.Han, r) ||
unicode.Is(unicode.Lm, r) ||
unicode.Is(unicode.Hangul, r) ||
unicode.Is(cjkSymbolsAndPunctuation, r)
}
// IsSpaceDiscardingUnicodeRune returns true if the given rune is a space-discarding Unicode character, otherwise false.
// See https://www.w3.org/TR/2020/WD-css-text-3-20200429/#space-discard-set
func IsSpaceDiscardingUnicodeRune(r rune) bool {
return unicode.Is(cjkRadicalsSupplement, r) ||
unicode.Is(kangxiRadicals, r) ||
unicode.Is(ideographicDescriptionCharacters, r) ||
unicode.Is(cjkSymbolsAndPunctuation, r) ||
unicode.Is(hiragana, r) ||
unicode.Is(katakana, r) ||
unicode.Is(kanbun, r) ||
unicode.Is(cjkStrokes, r) ||
unicode.Is(katakanaPhoneticExtensions, r) ||
unicode.Is(cjkCompatibility, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionA, r) ||
unicode.Is(cjkUnifiedIdeographs, r) ||
unicode.Is(yiSyllables, r) ||
unicode.Is(yiRadicals, r) ||
unicode.Is(cjkCompatibilityIdeographs, r) ||
unicode.Is(verticalForms, r) ||
unicode.Is(cjkCompatibilityForms, r) ||
unicode.Is(smallFormVariants, r) ||
unicode.Is(halfwidthAndFullwidthForms, r) ||
unicode.Is(kanaSupplement, r) ||
unicode.Is(kanaExtendedA, r) ||
unicode.Is(smallKanaExtension, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionB, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionC, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionD, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionE, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionF, r) ||
unicode.Is(cjkCompatibilityIdeographsSupplement, r) ||
unicode.Is(cjkUnifiedIdeographsExtensionG, r)
}
// EastAsianWidth returns the East Asian Width property of the given rune.
// See https://www.unicode.org/reports/tr11/tr11-36.html
func EastAsianWidth(r rune) string {
switch {
case r == 0x3000,
(0xFF01 <= r && r <= 0xFF60),
(0xFFE0 <= r && r <= 0xFFE6):
return "F"
case r == 0x20A9,
(0xFF61 <= r && r <= 0xFFBE),
(0xFFC2 <= r && r <= 0xFFC7),
(0xFFCA <= r && r <= 0xFFCF),
(0xFFD2 <= r && r <= 0xFFD7),
(0xFFDA <= r && r <= 0xFFDC),
(0xFFE8 <= r && r <= 0xFFEE):
return "H"
case (0x1100 <= r && r <= 0x115F),
(0x11A3 <= r && r <= 0x11A7),
(0x11FA <= r && r <= 0x11FF),
(0x2329 <= r && r <= 0x232A),
(0x2E80 <= r && r <= 0x2E99),
(0x2E9B <= r && r <= 0x2EF3),
(0x2F00 <= r && r <= 0x2FD5),
(0x2FF0 <= r && r <= 0x2FFB),
(0x3001 <= r && r <= 0x303E),
(0x3041 <= r && r <= 0x3096),
(0x3099 <= r && r <= 0x30FF),
(0x3105 <= r && r <= 0x312D),
(0x3131 <= r && r <= 0x318E),
(0x3190 <= r && r <= 0x31BA),
(0x31C0 <= r && r <= 0x31E3),
(0x31F0 <= r && r <= 0x321E),
(0x3220 <= r && r <= 0x3247),
(0x3250 <= r && r <= 0x32FE),
(0x3300 <= r && r <= 0x4DBF),
(0x4E00 <= r && r <= 0xA48C),
(0xA490 <= r && r <= 0xA4C6),
(0xA960 <= r && r <= 0xA97C),
(0xAC00 <= r && r <= 0xD7A3),
(0xD7B0 <= r && r <= 0xD7C6),
(0xD7CB <= r && r <= 0xD7FB),
(0xF900 <= r && r <= 0xFAFF),
(0xFE10 <= r && r <= 0xFE19),
(0xFE30 <= r && r <= 0xFE52),
(0xFE54 <= r && r <= 0xFE66),
(0xFE68 <= r && r <= 0xFE6B),
(0x1B000 <= r && r <= 0x1B001),
(0x1F200 <= r && r <= 0x1F202),
(0x1F210 <= r && r <= 0x1F23A),
(0x1F240 <= r && r <= 0x1F248),
(0x1F250 <= r && r <= 0x1F251),
(0x20000 <= r && r <= 0x2F73F),
(0x2B740 <= r && r <= 0x2FFFD),
(0x30000 <= r && r <= 0x3FFFD):
return "W"
case (0x0020 <= r && r <= 0x007E),
(0x00A2 <= r && r <= 0x00A3),
(0x00A5 <= r && r <= 0x00A6),
r == 0x00AC,
r == 0x00AF,
(0x27E6 <= r && r <= 0x27ED),
(0x2985 <= r && r <= 0x2986):
return "Na"
case (0x00A1 == r),
(0x00A4 == r),
(0x00A7 <= r && r <= 0x00A8),
(0x00AA == r),
(0x00AD <= r && r <= 0x00AE),
(0x00B0 <= r && r <= 0x00B4),
(0x00B6 <= r && r <= 0x00BA),
(0x00BC <= r && r <= 0x00BF),
(0x00C6 == r),
(0x00D0 == r),
(0x00D7 <= r && r <= 0x00D8),
(0x00DE <= r && r <= 0x00E1),
(0x00E6 == r),
(0x00E8 <= r && r <= 0x00EA),
(0x00EC <= r && r <= 0x00ED),
(0x00F0 == r),
(0x00F2 <= r && r <= 0x00F3),
(0x00F7 <= r && r <= 0x00FA),
(0x00FC == r),
(0x00FE == r),
(0x0101 == r),
(0x0111 == r),
(0x0113 == r),
(0x011B == r),
(0x0126 <= r && r <= 0x0127),
(0x012B == r),
(0x0131 <= r && r <= 0x0133),
(0x0138 == r),
(0x013F <= r && r <= 0x0142),
(0x0144 == r),
(0x0148 <= r && r <= 0x014B),
(0x014D == r),
(0x0152 <= r && r <= 0x0153),
(0x0166 <= r && r <= 0x0167),
(0x016B == r),
(0x01CE == r),
(0x01D0 == r),
(0x01D2 == r),
(0x01D4 == r),
(0x01D6 == r),
(0x01D8 == r),
(0x01DA == r),
(0x01DC == r),
(0x0251 == r),
(0x0261 == r),
(0x02C4 == r),
(0x02C7 == r),
(0x02C9 <= r && r <= 0x02CB),
(0x02CD == r),
(0x02D0 == r),
(0x02D8 <= r && r <= 0x02DB),
(0x02DD == r),
(0x02DF == r),
(0x0300 <= r && r <= 0x036F),
(0x0391 <= r && r <= 0x03A1),
(0x03A3 <= r && r <= 0x03A9),
(0x03B1 <= r && r <= 0x03C1),
(0x03C3 <= r && r <= 0x03C9),
(0x0401 == r),
(0x0410 <= r && r <= 0x044F),
(0x0451 == r),
(0x2010 == r),
(0x2013 <= r && r <= 0x2016),
(0x2018 <= r && r <= 0x2019),
(0x201C <= r && r <= 0x201D),
(0x2020 <= r && r <= 0x2022),
(0x2024 <= r && r <= 0x2027),
(0x2030 == r),
(0x2032 <= r && r <= 0x2033),
(0x2035 == r),
(0x203B == r),
(0x203E == r),
(0x2074 == r),
(0x207F == r),
(0x2081 <= r && r <= 0x2084),
(0x20AC == r),
(0x2103 == r),
(0x2105 == r),
(0x2109 == r),
(0x2113 == r),
(0x2116 == r),
(0x2121 <= r && r <= 0x2122),
(0x2126 == r),
(0x212B == r),
(0x2153 <= r && r <= 0x2154),
(0x215B <= r && r <= 0x215E),
(0x2160 <= r && r <= 0x216B),
(0x2170 <= r && r <= 0x2179),
(0x2189 == r),
(0x2190 <= r && r <= 0x2199),
(0x21B8 <= r && r <= 0x21B9),
(0x21D2 == r),
(0x21D4 == r),
(0x21E7 == r),
(0x2200 == r),
(0x2202 <= r && r <= 0x2203),
(0x2207 <= r && r <= 0x2208),
(0x220B == r),
(0x220F == r),
(0x2211 == r),
(0x2215 == r),
(0x221A == r),
(0x221D <= r && r <= 0x2220),
(0x2223 == r),
(0x2225 == r),
(0x2227 <= r && r <= 0x222C),
(0x222E == r),
(0x2234 <= r && r <= 0x2237),
(0x223C <= r && r <= 0x223D),
(0x2248 == r),
(0x224C == r),
(0x2252 == r),
(0x2260 <= r && r <= 0x2261),
(0x2264 <= r && r <= 0x2267),
(0x226A <= r && r <= 0x226B),
(0x226E <= r && r <= 0x226F),
(0x2282 <= r && r <= 0x2283),
(0x2286 <= r && r <= 0x2287),
(0x2295 == r),
(0x2299 == r),
(0x22A5 == r),
(0x22BF == r),
(0x2312 == r),
(0x2460 <= r && r <= 0x24E9),
(0x24EB <= r && r <= 0x254B),
(0x2550 <= r && r <= 0x2573),
(0x2580 <= r && r <= 0x258F),
(0x2592 <= r && r <= 0x2595),
(0x25A0 <= r && r <= 0x25A1),
(0x25A3 <= r && r <= 0x25A9),
(0x25B2 <= r && r <= 0x25B3),
(0x25B6 <= r && r <= 0x25B7),
(0x25BC <= r && r <= 0x25BD),
(0x25C0 <= r && r <= 0x25C1),
(0x25C6 <= r && r <= 0x25C8),
(0x25CB == r),
(0x25CE <= r && r <= 0x25D1),
(0x25E2 <= r && r <= 0x25E5),
(0x25EF == r),
(0x2605 <= r && r <= 0x2606),
(0x2609 == r),
(0x260E <= r && r <= 0x260F),
(0x2614 <= r && r <= 0x2615),
(0x261C == r),
(0x261E == r),
(0x2640 == r),
(0x2642 == r),
(0x2660 <= r && r <= 0x2661),
(0x2663 <= r && r <= 0x2665),
(0x2667 <= r && r <= 0x266A),
(0x266C <= r && r <= 0x266D),
(0x266F == r),
(0x269E <= r && r <= 0x269F),
(0x26BE <= r && r <= 0x26BF),
(0x26C4 <= r && r <= 0x26CD),
(0x26CF <= r && r <= 0x26E1),
(0x26E3 == r),
(0x26E8 <= r && r <= 0x26FF),
(0x273D == r),
(0x2757 == r),
(0x2776 <= r && r <= 0x277F),
(0x2B55 <= r && r <= 0x2B59),
(0x3248 <= r && r <= 0x324F),
(0xE000 <= r && r <= 0xF8FF),
(0xFE00 <= r && r <= 0xFE0F),
(0xFFFD == r),
(0x1F100 <= r && r <= 0x1F10A),
(0x1F110 <= r && r <= 0x1F12D),
(0x1F130 <= r && r <= 0x1F169),
(0x1F170 <= r && r <= 0x1F19A),
(0xE0100 <= r && r <= 0xE01EF),
(0xF0000 <= r && r <= 0xFFFFD),
(0x100000 <= r && r <= 0x10FFFD):
return "A"
default:
return "N"
}
}
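The block tables in this file are plain `unicode.RangeTable` values checked with `unicode.Is`. The sketch below shows that pattern in isolation, using only the standard library. The names `hiraganaBlock`, `kanaSupplementBlock` and `inAnyBlock` are hypothetical and are not part of goldmark; the ranges are the Hiragana and Kana Supplement blocks.

```go
package main

import (
	"fmt"
	"unicode"
)

// hiraganaBlock is a hypothetical table for the Hiragana block.
// Code points up to 0xFFFF belong in R16.
var hiraganaBlock = &unicode.RangeTable{
	R16: []unicode.Range16{
		{Lo: 0x3040, Hi: 0x309F, Stride: 1},
	},
}

// kanaSupplementBlock is a hypothetical table for the Kana Supplement block.
// Code points above 0xFFFF must be listed in R32 instead of R16.
var kanaSupplementBlock = &unicode.RangeTable{
	R32: []unicode.Range32{
		{Lo: 0x1B000, Hi: 0x1B0FF, Stride: 1},
	},
}

// inAnyBlock reports whether r falls in either table, chaining
// unicode.Is calls the same way the functions above do.
func inAnyBlock(r rune) bool {
	return unicode.Is(hiraganaBlock, r) ||
		unicode.Is(kanaSupplementBlock, r)
}

func main() {
	for _, r := range []rune{'あ', 'a', 0x1B000} {
		fmt.Printf("%U in block: %v\n", r, inAnyBlock(r))
	}
}
```

A long `||` chain keeps each block individually named. `unicode.In(r, tables...)` from the standard library is an equivalent, shorter way to test several tables at once.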

@ -1,3 +1,4 @@
+//go:build appengine || js
 // +build appengine js

 package util

@ -1,4 +1,5 @@
-// +build !appengine,!js
+//go:build !appengine && !js && !go1.21
+// +build !appengine,!js,!go1.21

 package util

util/util_unsafe_go121.go (new file)

@ -0,0 +1,18 @@
//go:build !appengine && !js && go1.21
// +build !appengine,!js,go1.21

package util

import (
	"unsafe"
)

// BytesToReadOnlyString returns a string converted from given bytes.
func BytesToReadOnlyString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// StringToReadOnlyBytes returns bytes converted from given string.
func StringToReadOnlyBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}
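The new file relies on `unsafe.String`, `unsafe.SliceData`, `unsafe.Slice` and `unsafe.StringData`, which the standard library added in Go 1.20. The following standalone sketch, assuming a Go 1.20+ toolchain, illustrates why the results are "read-only": the converted values share memory with their source and no copy is made. The helper names `bytesToString` and `stringToBytes` are hypothetical stand-ins for the two functions above.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying.
// The caller must not modify b while the string is in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes reinterprets s as a byte slice without copying.
// The returned slice must never be written to: strings are immutable,
// and the bytes of a constant may live in read-only memory.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

// sharesMemory reports whether s and b start at the same address.
func sharesMemory(s string, b []byte) bool {
	return unsafe.StringData(s) == unsafe.SliceData(b)
}

func main() {
	b := []byte("goldmark")
	s := bytesToString(b)
	fmt.Println(s, sharesMemory(s, b))

	rb := stringToBytes("read only")
	fmt.Println(len(rb), string(rb))
}
```

Because no allocation happens, these conversions suit hot paths such as map lookups keyed by a byte slice. The trade-off is that the compiler can no longer enforce immutability, so the read-only contract rests entirely on the caller.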