Compare commits


7 Commits

Author SHA1 Message Date
Claude 8df3b65280 Preserve original formatting in Unmarshaler by using raw byte ranges
Instead of reconstructing key-value lines from parsed components, the
unmarshaler now uses the original raw bytes from the document. This preserves:
- Whitespace around '=' (e.g., "key   =   value")
- String quoting style (basic vs literal)
- Number formats (hex, octal, binary)
- Inline table formatting

Changes:
- Add Raw range tracking to KeyValue expressions in parseKeyval
- Update handleKeyValuesUnmarshaler to use expr.Raw directly
- Remove keyNeedsQuoting helper (no longer needed)
- Add TestIssue873_FormattingPreservation test
- Update expected output in ExampleParser_comments
2026-01-26 21:30:16 +00:00
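The raw-range idea from this commit can be sketched in a few lines. Note the `rawRange` type and `rawBytes` helper below are illustrative assumptions, not the library's actual types; only the technique (slicing the original document instead of re-serializing parsed components) comes from the commit message:

```go
package main

import "fmt"

// rawRange is a hypothetical offset/length pair into the original input,
// standing in for the Raw range tracking described in the commit.
type rawRange struct {
	offset, length int
}

// rawBytes returns the untouched bytes for a key-value expression, so the
// exact whitespace, quoting style, and number format survive.
func rawBytes(doc []byte, r rawRange) []byte {
	return doc[r.offset : r.offset+r.length]
}

func main() {
	doc := []byte("key   =   0xBEEF\nother = 1\n")
	// The first expression spans bytes [0, 16): slicing the document keeps
	// the extra spaces around '=' and the hexadecimal notation intact,
	// which re-serializing the parsed key and value would normalize away.
	fmt.Printf("%s\n", rawBytes(doc, rawRange{offset: 0, length: 16}))
}
```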
Claude 6c995ec13e Add Example tests and fix raw value extraction for boolean types
Add two godoc Example tests:
- ExampleDecoder_EnableUnmarshalerInterface_dynamicConfig: shows dynamic
  unmarshaling based on a type field
- ExampleDecoder_EnableUnmarshalerInterface_rawMessage: demonstrates
  RawMessage usage for deferred parsing

Fix handleKeyValuesUnmarshaler to handle values where Raw.Length == 0
(like boolean types) by using value.Data as fallback.
2026-01-26 21:30:15 +00:00
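The RawMessage pattern exercised by these Example tests can be sketched without the library. The `Unmarshaler` interface and the direct `UnmarshalTOML` call below are assumptions based on the commit messages (the real interface lives in the unstable package); only the deferred-parsing idea is taken from the text:

```go
package main

import "fmt"

// Unmarshaler mirrors the bytes-based interface described in this series:
// implementations receive the raw TOML bytes of the matched element.
// (Hypothetical local copy for illustration.)
type Unmarshaler interface {
	UnmarshalTOML(b []byte) error
}

// RawMessage stores the raw bytes untouched so decoding can happen later,
// in the spirit of json.RawMessage.
type RawMessage []byte

// UnmarshalTOML copies the raw bytes, deferring all parsing to the caller.
func (r *RawMessage) UnmarshalTOML(b []byte) error {
	*r = append((*r)[:0], b...)
	return nil
}

func main() {
	// A decoder with the interface enabled would invoke UnmarshalTOML with
	// the raw bytes of a table; we simulate that call directly here.
	var raw RawMessage
	var u Unmarshaler = &raw
	if err := u.UnmarshalTOML([]byte("kind = \"s3\"\nbucket = \"logs\"")); err != nil {
		panic(err)
	}
	// The application can inspect a type field first, then decode raw later.
	fmt.Println(len(raw) > 0)
}
```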
Claude 9f1bb6c97d Add double pointer test to achieve 100% coverage for handleKeyValues
Add TestIssue873_DoublePointerUnmarshaler to test pointer-to-pointer
to Unmarshaler types. This covers the pointer dereferencing loop in
handleKeyValues, bringing its coverage from 88% to 100%.

Total coverage: 97.4%
2026-01-26 21:30:15 +00:00
Claude d3b9283095 Add test for dotted keys to improve coverage
Add TestIssue873_DottedKeys to test dotted key handling (e.g., sub.key = value)
in the Unmarshaler interface. This improves coverage for handleKeyValuesUnmarshaler
from 93.3% to 96.7%.
2026-01-26 21:30:15 +00:00
Claude 013ffc24b8 Fix lint issues and improve test coverage for Unmarshaler interface
- Apply De Morgan's law in keyNeedsQuoting to satisfy staticcheck QF1001
- Remove unused splitTableUnmarshaler type from test
- Fix unused parameter lint warning in errorUnmarshaler873
- Add test for quoted keys that need special handling
- Add test for error propagation from UnmarshalTOML
- Update customTable873 parser to handle quoted keys properly

Coverage improved:
- handleKeyValuesUnmarshaler: 80.0% -> 93.3%
- keyNeedsQuoting: 66.7% -> 83.3%
- Overall main package: 97.2% -> 97.5%
2026-01-26 21:30:15 +00:00
Claude 2762e24a9c Implement bytes-based Unmarshaler interface for tables and arrays (#873)
This change brings back support for the unstable.Unmarshaler interface
for tables and array tables, addressing issue #873.

Key changes:
- Changed UnmarshalTOML signature from (*Node) to ([]byte) to provide
  raw TOML bytes instead of AST nodes
- Added RawMessage type (similar to json.RawMessage) for capturing raw
  TOML bytes for later processing
- Updated handleKeyValuesUnmarshaler to reconstruct key-value lines
  from the parsed keys and raw value bytes
- Added support for slice types implementing Unmarshaler (e.g., RawMessage)
- Removed unused AST helper functions from unstable/ast.go

The bytes-based interface allows users to:
- Get raw TOML bytes for custom parsing
- Delay TOML decoding using RawMessage
- Implement custom unmarshaling logic for complex types

Tests added for:
- Table unmarshaler with various scenarios
- Array table unmarshaler
- Split tables (same parent defined in multiple places)
- RawMessage usage
- Nested tables and mixed regular fields
2026-01-26 21:30:14 +00:00
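As a sketch of the "custom unmarshaling logic for complex types" use case: a type that parses its own raw value bytes. The `portRange` type and the direct `UnmarshalTOML` invocation are illustrative assumptions, not part of the library; a decoder implementing the bytes-based interface would make the call itself:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// portRange parses its own raw TOML value bytes, e.g. "8000-8080",
// instead of relying on generic decoding into a struct.
type portRange struct {
	From, To int
}

// UnmarshalTOML implements the bytes-based interface described above:
// the decoder hands over raw bytes and the type parses them itself.
func (p *portRange) UnmarshalTOML(b []byte) error {
	s := strings.Trim(strings.TrimSpace(string(b)), `"`)
	lo, hi, ok := strings.Cut(s, "-")
	if !ok {
		return fmt.Errorf("invalid port range %q", s)
	}
	var err error
	if p.From, err = strconv.Atoi(lo); err != nil {
		return err
	}
	p.To, err = strconv.Atoi(hi)
	return err
}

func main() {
	var p portRange
	// Simulates the decoder calling UnmarshalTOML with raw value bytes.
	if err := p.UnmarshalTOML([]byte(`"8000-8080"`)); err != nil {
		panic(err)
	}
	fmt.Println(p.From, p.To)
}
```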
Claude 5b6828661c Support Unmarshaler interface for tables and array tables (#873)
Extend the unstable.Unmarshaler interface support to work with tables
and array tables, not just single values.

When a type implementing unstable.Unmarshaler is the target of a table
(e.g., [table] or [[array]]), the UnmarshalTOML method receives a
synthetic InlineTable node containing all the key-value pairs belonging
to that table.

Key changes:
- Add handleKeyValuesUnmarshaler to collect and process table content
- Add copyExpressionNodes to deep-copy AST nodes for synthetic tables
- Add helper functions in unstable/ast.go for node manipulation
- Update documentation for EnableUnmarshalerInterface
- Add comprehensive tests for table and array table unmarshaling
2026-01-26 21:30:14 +00:00
24 changed files with 619 additions and 474 deletions
+1 -1
@@ -19,7 +19,7 @@ jobs:
         dry-run: false
         language: go
     - name: Upload Crash
-      uses: actions/upload-artifact@v7
+      uses: actions/upload-artifact@v6
       if: failure() && steps.build.outcome == 'success'
       with:
         name: artifacts
+1 -1
@@ -15,6 +15,6 @@ jobs:
     - name: Setup go
       uses: actions/setup-go@v6
       with:
-        go-version: "1.26"
+        go-version: "1.25"
     - name: Run tests with coverage
       run: ./ci.sh coverage -d "${GITHUB_BASE_REF-HEAD}"
+2 -2
@@ -20,7 +20,7 @@ jobs:
           fetch-depth: 0
       - name: Set up Docker Buildx
-        uses: docker/setup-buildx-action@v4
+        uses: docker/setup-buildx-action@v3
       - name: Run Go versions compatibility test
         run: |
@@ -28,7 +28,7 @@ jobs:
           ./test-go-versions.sh --output ./test-results $VERSIONS
       - name: Upload test results
-        uses: actions/upload-artifact@v7
+        uses: actions/upload-artifact@v6
         with:
           name: go-versions-test-results
           path: |
+1 -1
@@ -15,7 +15,7 @@ jobs:
     - name: Setup go
       uses: actions/setup-go@v6
       with:
-        go-version: "1.26"
+        go-version: "1.24"
     - name: Run golangci-lint
       uses: golangci/golangci-lint-action@v9
       with:
+3 -3
@@ -22,15 +22,15 @@ jobs:
       - name: Set up Go
         uses: actions/setup-go@v6
         with:
-          go-version: "1.26"
+          go-version: "1.25"
       - name: Login to GitHub Container Registry
-        uses: docker/login-action@v4
+        uses: docker/login-action@v3
         with:
           registry: ghcr.io
           username: ${{ github.actor }}
           password: ${{ secrets.GITHUB_TOKEN }}
       - name: Run GoReleaser
-        uses: goreleaser/goreleaser-action@v7
+        uses: goreleaser/goreleaser-action@v6
        with:
          distribution: goreleaser
          version: '~> v2'
+1 -2
@@ -10,10 +10,9 @@ on:
 jobs:
   build:
     strategy:
-      fail-fast: false
       matrix:
         os: [ 'ubuntu-latest', 'windows-latest', 'macos-latest', 'macos-14' ]
-        go: [ '1.25', '1.26' ]
+        go: [ '1.24', '1.25' ]
     runs-on: ${{ matrix.os }}
     name: ${{ matrix.go }}/${{ matrix.os }}
     steps:
+3
@@ -22,6 +22,7 @@ builds:
       - linux_riscv64
       - windows_amd64
       - windows_arm64
+      - windows_arm
       - darwin_amd64
       - darwin_arm64
   - id: tomljson
@@ -41,6 +42,7 @@ builds:
       - linux_riscv64
       - windows_amd64
       - windows_arm64
+      - windows_arm
       - darwin_amd64
       - darwin_arm64
   - id: jsontoml
@@ -60,6 +62,7 @@ builds:
       - linux_arm
       - windows_amd64
       - windows_arm64
+      - windows_arm
       - darwin_amd64
       - darwin_arm64
 universal_binaries:
+27 -27
@@ -235,17 +235,17 @@ the AST level. See https://pkg.go.dev/github.com/pelletier/go-toml/v2/unstable.
 Execution time speedup compared to other Go TOML libraries:
 
 <table>
   <thead>
     <tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
   </thead>
   <tbody>
-    <tr><td>Marshal/HugoFrontMatter-2</td><td>2.1x</td><td>2.0x</td></tr>
-    <tr><td>Marshal/ReferenceFile/map-2</td><td>2.0x</td><td>2.0x</td></tr>
-    <tr><td>Marshal/ReferenceFile/struct-2</td><td>2.3x</td><td>2.5x</td></tr>
-    <tr><td>Unmarshal/HugoFrontMatter-2</td><td>3.3x</td><td>2.8x</td></tr>
-    <tr><td>Unmarshal/ReferenceFile/map-2</td><td>2.9x</td><td>3.0x</td></tr>
-    <tr><td>Unmarshal/ReferenceFile/struct-2</td><td>4.8x</td><td>5.0x</td></tr>
+    <tr><td>Marshal/HugoFrontMatter-2</td><td>1.9x</td><td>2.2x</td></tr>
+    <tr><td>Marshal/ReferenceFile/map-2</td><td>1.7x</td><td>2.1x</td></tr>
+    <tr><td>Marshal/ReferenceFile/struct-2</td><td>2.2x</td><td>3.0x</td></tr>
+    <tr><td>Unmarshal/HugoFrontMatter-2</td><td>2.9x</td><td>2.7x</td></tr>
+    <tr><td>Unmarshal/ReferenceFile/map-2</td><td>2.6x</td><td>2.7x</td></tr>
+    <tr><td>Unmarshal/ReferenceFile/struct-2</td><td>4.6x</td><td>5.1x</td></tr>
   </tbody>
 </table>
 
 <details><summary>See more</summary>
 <p>The table above has the results of the most common use-cases. The table below
@@ -253,22 +253,22 @@ contains the results of all benchmarks, including unrealistic ones. It is
 provided for completeness.</p>
 
 <table>
   <thead>
     <tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
   </thead>
   <tbody>
-    <tr><td>Marshal/SimpleDocument/map-2</td><td>2.0x</td><td>2.9x</td></tr>
-    <tr><td>Marshal/SimpleDocument/struct-2</td><td>2.5x</td><td>3.6x</td></tr>
-    <tr><td>Unmarshal/SimpleDocument/map-2</td><td>4.2x</td><td>3.4x</td></tr>
-    <tr><td>Unmarshal/SimpleDocument/struct-2</td><td>5.9x</td><td>4.4x</td></tr>
-    <tr><td>UnmarshalDataset/example-2</td><td>3.2x</td><td>2.9x</td></tr>
-    <tr><td>UnmarshalDataset/code-2</td><td>2.4x</td><td>2.8x</td></tr>
-    <tr><td>UnmarshalDataset/twitter-2</td><td>2.7x</td><td>2.5x</td></tr>
-    <tr><td>UnmarshalDataset/citm_catalog-2</td><td>2.3x</td><td>2.3x</td></tr>
-    <tr><td>UnmarshalDataset/canada-2</td><td>1.9x</td><td>1.5x</td></tr>
-    <tr><td>UnmarshalDataset/config-2</td><td>5.4x</td><td>3.0x</td></tr>
-    <tr><td>geomean</td><td>2.9x</td><td>2.8x</td></tr>
+    <tr><td>Marshal/SimpleDocument/map-2</td><td>1.8x</td><td>2.7x</td></tr>
+    <tr><td>Marshal/SimpleDocument/struct-2</td><td>2.7x</td><td>3.8x</td></tr>
+    <tr><td>Unmarshal/SimpleDocument/map-2</td><td>3.8x</td><td>3.0x</td></tr>
+    <tr><td>Unmarshal/SimpleDocument/struct-2</td><td>5.6x</td><td>4.1x</td></tr>
+    <tr><td>UnmarshalDataset/example-2</td><td>3.0x</td><td>3.2x</td></tr>
+    <tr><td>UnmarshalDataset/code-2</td><td>2.3x</td><td>2.9x</td></tr>
+    <tr><td>UnmarshalDataset/twitter-2</td><td>2.6x</td><td>2.7x</td></tr>
+    <tr><td>UnmarshalDataset/citm_catalog-2</td><td>2.2x</td><td>2.3x</td></tr>
+    <tr><td>UnmarshalDataset/canada-2</td><td>1.8x</td><td>1.5x</td></tr>
+    <tr><td>UnmarshalDataset/config-2</td><td>4.1x</td><td>2.9x</td></tr>
+    <tr><td>geomean</td><td>2.7x</td><td>2.8x</td></tr>
   </tbody>
 </table>
 
 <p>This table can be generated with <code>./ci.sh benchmark -a -html</code>.</p>
 </details>
+1 -6
@@ -147,7 +147,7 @@ bench() {
     pushd "$dir"
     if [ "${replace}" != "" ]; then
-        find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2\"|${replace}\"|g" {} \;
+        find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2|${replace}|g" {} \;
         go get "${replace}"
     fi
@@ -195,11 +195,6 @@ for line in reversed(lines[2:]):
         "%.1fx" % (float(line[3])/v2), # v1
         "%.1fx" % (float(line[7])/v2), # bs
     ])
-if not results:
-    print("No benchmark results to display.", file=sys.stderr)
-    sys.exit(1)
 # move geomean to the end
 results.append(results[0])
 del results[0]
+93 -78
@@ -9,60 +9,64 @@ import (
 	"github.com/pelletier/go-toml/v2/unstable"
 )
 
-func parseInteger(b []byte, base int) (int64, error) {
+func parseInteger(b []byte) (int64, error) {
 	if len(b) > 2 && b[0] == '0' {
 		switch b[1] {
 		case 'x':
-			return parseIntHex(b, base)
+			return parseIntHex(b)
 		case 'b':
-			return parseIntBin(b, base)
+			return parseIntBin(b)
 		case 'o':
-			return parseIntOct(b, base)
+			return parseIntOct(b)
 		default:
 			panic(fmt.Errorf("invalid base '%c', should have been checked by scanIntOrFloat", b[1]))
 		}
 	}
-	return parseIntDec(b, base)
+	return parseIntDec(b)
 }
 
-func parseLocalDate(b []byte, base int) (LocalDate, error) {
+func parseLocalDate(b []byte) (LocalDate, error) {
+	// full-date      = date-fullyear "-" date-month "-" date-mday
+	// date-fullyear  = 4DIGIT
+	// date-month     = 2DIGIT  ; 01-12
+	// date-mday      = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on month/year
 	var date LocalDate
 
 	if len(b) != 10 || b[4] != '-' || b[7] != '-' {
-		return date, unstable.NewParserError(b, base, "dates are expected to have the format YYYY-MM-DD")
+		return date, unstable.NewParserError(b, "dates are expected to have the format YYYY-MM-DD")
 	}
 
 	var err error
 
-	date.Year, err = parseDecimalDigits(b[0:4], base)
+	date.Year, err = parseDecimalDigits(b[0:4])
 	if err != nil {
 		return LocalDate{}, err
 	}
 
-	date.Month, err = parseDecimalDigits(b[5:7], base+5)
+	date.Month, err = parseDecimalDigits(b[5:7])
 	if err != nil {
 		return LocalDate{}, err
 	}
 
-	date.Day, err = parseDecimalDigits(b[8:10], base+8)
+	date.Day, err = parseDecimalDigits(b[8:10])
 	if err != nil {
 		return LocalDate{}, err
 	}
 
 	if !isValidDate(date.Year, date.Month, date.Day) {
-		return LocalDate{}, unstable.NewParserError(b, base, "impossible date")
+		return LocalDate{}, unstable.NewParserError(b, "impossible date")
 	}
 
 	return date, nil
 }
 
-func parseDecimalDigits(b []byte, base int) (int, error) {
+func parseDecimalDigits(b []byte) (int, error) {
 	v := 0
 
 	for i, c := range b {
 		if c < '0' || c > '9' {
-			return 0, unstable.NewParserError(b[i:i+1], base+i, "expected digit (0-9)")
+			return 0, unstable.NewParserError(b[i:i+1], "expected digit (0-9)")
 		}
 		v *= 10
 		v += int(c - '0')
@@ -71,18 +75,21 @@ func parseDecimalDigits(b []byte, base int) (int, error) {
 	return v, nil
 }
 
-func parseDateTime(b []byte, base int) (time.Time, error) {
-	origLen := len(b)
-	dt, b, err := parseLocalDateTime(b, base)
+func parseDateTime(b []byte) (time.Time, error) {
+	// offset-date-time = full-date time-delim full-time
+	// full-time      = partial-time time-offset
+	// time-offset    = "Z" / time-numoffset
+	// time-numoffset = ( "+" / "-" ) time-hour ":" time-minute
+	dt, b, err := parseLocalDateTime(b)
 	if err != nil {
 		return time.Time{}, err
 	}
 
-	tzBase := base + origLen - len(b)
-
 	var zone *time.Location
 
 	if len(b) == 0 {
+		// parser should have checked that when assigning the date time node
 		panic("date time should have a timezone")
 	}
@@ -92,7 +99,7 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
 	} else {
 		const dateTimeByteLen = 6
 		if len(b) != dateTimeByteLen {
-			return time.Time{}, unstable.NewParserError(b, tzBase, "invalid date-time timezone")
+			return time.Time{}, unstable.NewParserError(b, "invalid date-time timezone")
 		}
 		var direction int
 		switch b[0] {
@@ -101,27 +108,27 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
 		case '+':
 			direction = +1
 		default:
-			return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset character")
+			return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset character")
 		}
 
 		if b[3] != ':' {
-			return time.Time{}, unstable.NewParserError(b[3:4], tzBase+3, "expected a : separator")
+			return time.Time{}, unstable.NewParserError(b[3:4], "expected a : separator")
 		}
 
-		hours, err := parseDecimalDigits(b[1:3], tzBase+1)
+		hours, err := parseDecimalDigits(b[1:3])
 		if err != nil {
 			return time.Time{}, err
 		}
 		if hours > 23 {
-			return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset hours")
+			return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset hours")
 		}
 
-		minutes, err := parseDecimalDigits(b[4:6], tzBase+4)
+		minutes, err := parseDecimalDigits(b[4:6])
 		if err != nil {
 			return time.Time{}, err
 		}
 		if minutes > 59 {
-			return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset minutes")
+			return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset minutes")
 		}
 
 		seconds := direction * (hours*3600 + minutes*60)
@@ -134,7 +141,7 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
 	}
 
 	if len(b) > 0 {
-		return time.Time{}, unstable.NewParserError(b, tzBase, "extra bytes at the end of the timezone")
+		return time.Time{}, unstable.NewParserError(b, "extra bytes at the end of the timezone")
 	}
 
 	t := time.Date(
@@ -150,15 +157,15 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
 	return t, nil
 }
 
-func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
+func parseLocalDateTime(b []byte) (LocalDateTime, []byte, error) {
 	var dt LocalDateTime
 
 	const localDateTimeByteMinLen = 11
 	if len(b) < localDateTimeByteMinLen {
-		return dt, nil, unstable.NewParserError(b, base, "local datetimes are expected to have the format YYYY-MM-DDTHH:MM:SS[.NNNNNNNNN]")
+		return dt, nil, unstable.NewParserError(b, "local datetimes are expected to have the format YYYY-MM-DDTHH:MM:SS[.NNNNNNNNN]")
 	}
 
-	date, err := parseLocalDate(b[:10], base)
+	date, err := parseLocalDate(b[:10])
 	if err != nil {
 		return dt, nil, err
 	}
@@ -166,10 +173,10 @@ func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
 	sep := b[10]
 	if sep != 'T' && sep != ' ' && sep != 't' {
-		return dt, nil, unstable.NewParserError(b[10:11], base+10, "datetime separator is expected to be T or a space")
+		return dt, nil, unstable.NewParserError(b[10:11], "datetime separator is expected to be T or a space")
 	}
 
-	t, rest, err := parseLocalTime(b[11:], base+11)
+	t, rest, err := parseLocalTime(b[11:])
 	if err != nil {
 		return dt, nil, err
 	}
@@ -181,53 +188,53 @@ func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
 // parseLocalTime is a bit different because it also returns the remaining
 // []byte that is didn't need. This is to allow parseDateTime to parse those
 // remaining bytes as a timezone.
-func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
+func parseLocalTime(b []byte) (LocalTime, []byte, error) {
 	var (
 		nspow = [10]int{0, 1e8, 1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1, 1e0}
 		t     LocalTime
 	)
 
+	// check if b matches to have expected format HH:MM:SS[.NNNNNN]
 	const localTimeByteLen = 8
 	if len(b) < localTimeByteLen {
-		return t, nil, unstable.NewParserError(b, base, "times are expected to have the format HH:MM:SS[.NNNNNN]")
+		return t, nil, unstable.NewParserError(b, "times are expected to have the format HH:MM:SS[.NNNNNN]")
 	}
 
 	var err error
 
-	t.Hour, err = parseDecimalDigits(b[0:2], base)
+	t.Hour, err = parseDecimalDigits(b[0:2])
 	if err != nil {
 		return t, nil, err
 	}
 
 	if t.Hour > 23 {
-		return t, nil, unstable.NewParserError(b[0:2], base, "hour cannot be greater 23")
+		return t, nil, unstable.NewParserError(b[0:2], "hour cannot be greater 23")
 	}
 	if b[2] != ':' {
-		return t, nil, unstable.NewParserError(b[2:3], base+2, "expecting colon between hours and minutes")
+		return t, nil, unstable.NewParserError(b[2:3], "expecting colon between hours and minutes")
 	}
 
-	t.Minute, err = parseDecimalDigits(b[3:5], base+3)
+	t.Minute, err = parseDecimalDigits(b[3:5])
 	if err != nil {
 		return t, nil, err
 	}
 	if t.Minute > 59 {
-		return t, nil, unstable.NewParserError(b[3:5], base+3, "minutes cannot be greater 59")
+		return t, nil, unstable.NewParserError(b[3:5], "minutes cannot be greater 59")
 	}
 	if b[5] != ':' {
-		return t, nil, unstable.NewParserError(b[5:6], base+5, "expecting colon between minutes and seconds")
+		return t, nil, unstable.NewParserError(b[5:6], "expecting colon between minutes and seconds")
 	}
 
-	t.Second, err = parseDecimalDigits(b[6:8], base+6)
+	t.Second, err = parseDecimalDigits(b[6:8])
 	if err != nil {
 		return t, nil, err
 	}
 	if t.Second > 59 {
-		return t, nil, unstable.NewParserError(b[6:8], base+6, "seconds cannot be greater than 59")
+		return t, nil, unstable.NewParserError(b[6:8], "seconds cannot be greater than 59")
 	}
 
 	b = b[8:]
-	base += 8
 
 	if len(b) >= 1 && b[0] == '.' {
 		frac := 0
@@ -237,7 +244,7 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
 		for i, c := range b[1:] {
 			if !isDigit(c) {
 				if i == 0 {
-					return t, nil, unstable.NewParserError(b[0:1], base, "need at least one digit after fraction point")
+					return t, nil, unstable.NewParserError(b[0:1], "need at least one digit after fraction point")
 				}
 				break
 			}
@@ -245,6 +252,13 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
 			const maxFracPrecision = 9
 			if i >= maxFracPrecision {
+				// go-toml allows decoding fractional seconds
+				// beyond the supported precision of 9
+				// digits. It truncates the fractional component
+				// to the supported precision and ignores the
+				// remaining digits.
+				//
+				// https://github.com/pelletier/go-toml/discussions/707
 				continue
 			}
@@ -254,7 +268,7 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
 		}
 
 		if precision == 0 {
-			return t, nil, unstable.NewParserError(b[:1], base, "nanoseconds need at least one digit")
+			return t, nil, unstable.NewParserError(b[:1], "nanoseconds need at least one digit")
 		}
 
 		t.Nanosecond = frac * nspow[precision]
@@ -265,35 +279,35 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
 	return t, b, nil
 }
 
-func parseFloat(b []byte, base int) (float64, error) {
+func parseFloat(b []byte) (float64, error) {
 	if len(b) == 4 && (b[0] == '+' || b[0] == '-') && b[1] == 'n' && b[2] == 'a' && b[3] == 'n' {
 		return math.NaN(), nil
 	}
 
-	cleaned, err := checkAndRemoveUnderscoresFloats(b, base)
+	cleaned, err := checkAndRemoveUnderscoresFloats(b)
 	if err != nil {
 		return 0, err
 	}
 
 	if cleaned[0] == '.' {
-		return 0, unstable.NewParserError(b, base, "float cannot start with a dot")
+		return 0, unstable.NewParserError(b, "float cannot start with a dot")
 	}
 
 	if cleaned[len(cleaned)-1] == '.' {
-		return 0, unstable.NewParserError(b, base, "float cannot end with a dot")
+		return 0, unstable.NewParserError(b, "float cannot end with a dot")
 	}
 
 	dotAlreadySeen := false
 	for i, c := range cleaned {
 		if c == '.' {
 			if dotAlreadySeen {
-				return 0, unstable.NewParserError(b[i:i+1], base+i, "float can have at most one decimal point")
+				return 0, unstable.NewParserError(b[i:i+1], "float can have at most one decimal point")
 			}
 			if !isDigit(cleaned[i-1]) {
-				return 0, unstable.NewParserError(b[i-1:i+1], base+i-1, "float decimal point must be preceded by a digit")
+				return 0, unstable.NewParserError(b[i-1:i+1], "float decimal point must be preceded by a digit")
 			}
 			if !isDigit(cleaned[i+1]) {
-				return 0, unstable.NewParserError(b[i:i+2], base+i, "float decimal point must be followed by a digit")
+				return 0, unstable.NewParserError(b[i:i+2], "float decimal point must be followed by a digit")
 			}
 			dotAlreadySeen = true
 		}
@@ -304,54 +318,54 @@ func parseFloat(b []byte, base int) (float64, error) {
 		start = 1
 	}
 
 	if cleaned[start] == '0' && len(cleaned) > start+1 && isDigit(cleaned[start+1]) {
-		return 0, unstable.NewParserError(b, base, "float integer part cannot have leading zeroes")
+		return 0, unstable.NewParserError(b, "float integer part cannot have leading zeroes")
 	}
 
 	f, err := strconv.ParseFloat(string(cleaned), 64)
 	if err != nil {
-		return 0, unstable.NewParserError(b, base, "unable to parse float: %w", err)
+		return 0, unstable.NewParserError(b, "unable to parse float: %w", err)
 	}
 
 	return f, nil
 }
 
-func parseIntHex(b []byte, base int) (int64, error) {
-	cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
+func parseIntHex(b []byte) (int64, error) {
+	cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:])
 	if err != nil {
 		return 0, err
 	}
 
 	i, err := strconv.ParseInt(string(cleaned), 16, 64)
 	if err != nil {
-		return 0, unstable.NewParserError(b, base, "couldn't parse hexadecimal number: %w", err)
+		return 0, unstable.NewParserError(b, "couldn't parse hexadecimal number: %w", err)
 	}
 
 	return i, nil
 }
 
-func parseIntOct(b []byte, base int) (int64, error) {
-	cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
+func parseIntOct(b []byte) (int64, error) {
+	cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:])
 	if err != nil {
 		return 0, err
 	}
 
 	i, err := strconv.ParseInt(string(cleaned), 8, 64)
 	if err != nil {
-		return 0, unstable.NewParserError(b, base, "couldn't parse octal number: %w", err)
+		return 0, unstable.NewParserError(b, "couldn't parse octal number: %w", err)
 	}
 
 	return i, nil
 }
 
-func parseIntBin(b []byte, base int) (int64, error) {
-	cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
+func parseIntBin(b []byte) (int64, error) {
	cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:])
 	if err != nil {
 		return 0, err
 	}
 
 	i, err := strconv.ParseInt(string(cleaned), 2, 64)
 	if err != nil {
-		return 0, unstable.NewParserError(b, base, "couldn't parse binary number: %w", err)
+		return 0, unstable.NewParserError(b, "couldn't parse binary number: %w", err)
 	}
 
 	return i, nil
@@ -361,8 +375,8 @@ func isSign(b byte) bool {
 	return b == '+' || b == '-'
 }
 
-func parseIntDec(b []byte, base int) (int64, error) {
-	cleaned, err := checkAndRemoveUnderscoresIntegers(b, base)
+func parseIntDec(b []byte) (int64, error) {
+	cleaned, err := checkAndRemoveUnderscoresIntegers(b)
 	if err != nil {
 		return 0, err
 	}
@@ -374,18 +388,18 @@ func parseIntDec(b []byte, base int) (int64, error) {
 	}
 
 	if len(cleaned) > startIdx+1 && cleaned[startIdx] == '0' {
-		return 0, unstable.NewParserError(b, base, "leading zero not allowed on decimal number")
+		return 0, unstable.NewParserError(b, "leading zero not allowed on decimal number")
 	}
 
 	i, err := strconv.ParseInt(string(cleaned), 10, 64)
 	if err != nil {
-		return 0, unstable.NewParserError(b, base, "couldn't parse decimal number: %w", err)
+		return 0, unstable.NewParserError(b, "couldn't parse decimal number: %w", err)
 	}
 
 	return i, nil
 }
 
-func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
+func checkAndRemoveUnderscoresIntegers(b []byte) ([]byte, error) {
 	start := 0
 	if b[start] == '+' || b[start] == '-' {
 		start++
@@ -396,11 +410,11 @@ func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
 	}
 
 	if b[start] == '_' {
-		return nil, unstable.NewParserError(b[start:start+1], base+start, "number cannot start with underscore")
+		return nil, unstable.NewParserError(b[start:start+1], "number cannot start with underscore")
 	}
 
 	if b[len(b)-1] == '_' {
-		return nil, unstable.NewParserError(b[len(b)-1:], base+len(b)-1, "number cannot end with underscore")
+		return nil, unstable.NewParserError(b[len(b)-1:], "number cannot end with underscore")
 	}
 
 	// fast path
@@ -422,7 +436,7 @@ func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
 		c := b[i]
 		if c == '_' {
 			if !before {
-				return nil, unstable.NewParserError(b[i-1:i+1], base+i-1, "number must have at least one digit between underscores")
+				return nil, unstable.NewParserError(b[i-1:i+1], "number must have at least one digit between underscores")
 			}
 			before = false
 		} else {
@@ -434,13 +448,13 @@ func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
 	return cleaned, nil
 }
 
-func checkAndRemoveUnderscoresFloats(b []byte, base int) ([]byte, error) {
+func checkAndRemoveUnderscoresFloats(b []byte) ([]byte, error) {
 	if b[0] == '_' {
-		return nil, unstable.NewParserError(b[0:1], base, "number cannot start with underscore")
+		return nil, unstable.NewParserError(b[0:1], "number cannot start with underscore")
 	}
 
 	if b[len(b)-1] == '_' {
-		return nil, unstable.NewParserError(b[len(b)-1:], base+len(b)-1, "number cannot end with underscore")
+		return nil, unstable.NewParserError(b[len(b)-1:], "number cannot end with underscore")
 	}
 
 	// fast path
@@ -463,26 +477,27 @@ func checkAndRemoveUnderscoresFloats(b []byte, base int) ([]byte, error) {
 		switch c {
 		case '_':
 			if !before {
-				return nil, unstable.NewParserError(b[i-1:i+1], base+i-1, "number must have at least one digit between underscores")
+				return nil, unstable.NewParserError(b[i-1:i+1], "number must have at least one digit between underscores")
 			}
 			if i < len(b)-1 && (b[i+1] == 'e' || b[i+1] == 'E') {
-				return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore before exponent")
+				return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore before exponent")
 			}
 			before = false
 		case '+', '-':
+			// signed exponents
 			cleaned = append(cleaned, c)
 			before = false
 		case 'e', 'E':
 			if i < len(b)-1 && b[i+1] == '_' {
-				return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore after exponent")
+				return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore after exponent")
 			}
 			cleaned = append(cleaned, c)
 		case '.':
 			if i < len(b)-1 && b[i+1] == '_' {
-				return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore after decimal point")
+				return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore after decimal point")
 			}
 			if i > 0 && b[i-1] == '_' {
-				return nil, unstable.NewParserError(b[i-1:i], base+i-1, "cannot have underscore before decimal point")
+				return nil, unstable.NewParserError(b[i-1:i], "cannot have underscore before decimal point")
 			}
 			cleaned = append(cleaned, c)
 		default:
+21 -1
@@ -2,6 +2,7 @@ package toml
 import (
 "fmt"
+"reflect"
 "strconv"
 "strings"
@@ -99,7 +100,7 @@ func (e *DecodeError) Key() Key {
 //
 //nolint:funlen
 func wrapDecodeError(document []byte, de *unstable.ParserError) *DecodeError {
-offset := de.Offset
+offset := subsliceOffset(document, de.Highlight)
 errMessage := de.Error()
 errLine, errColumn := positionAtEnd(document[:offset])
@@ -261,3 +262,22 @@ func positionAtEnd(b []byte) (row int, column int) {
 return row, column
 }
+// subsliceOffset returns the byte offset of subslice within data.
+// subslice must share the same backing array as data.
+func subsliceOffset(data []byte, subslice []byte) int {
+if len(subslice) == 0 {
+return 0
+}
+// Use reflect to get the data pointers of both slices.
+// This is safe because we're only reading the pointer values for comparison.
+dataPtr := reflect.ValueOf(data).Pointer()
+subPtr := reflect.ValueOf(subslice).Pointer()
+offset := int(subPtr - dataPtr)
+if offset < 0 || offset > len(data) {
+panic("subslice is not within data")
+}
+return offset
+}
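The pointer-comparison technique used by `subsliceOffset` above can be sketched as a standalone snippet; the names `offsetWithin` and the sample document are illustrative, not part of the library's API:

```go
package main

import (
	"fmt"
	"reflect"
)

// offsetWithin returns the byte offset of sub inside data, assuming sub
// shares data's backing array (as ParserError.Highlight does with the
// parsed document). Only the pointer values are read, never dereferenced.
func offsetWithin(data, sub []byte) int {
	if len(sub) == 0 {
		return 0
	}
	off := int(reflect.ValueOf(sub).Pointer() - reflect.ValueOf(data).Pointer())
	if off < 0 || off > len(data) {
		panic("sub is not within data")
	}
	return off
}

func main() {
	doc := []byte("key = value\nbad = ???")
	highlight := doc[18:21] // subslice pointing at "???"
	fmt.Println(offsetWithin(doc, highlight)) // 18
}
```

Because the highlight is guaranteed to alias the document, the offset no longer needs to be threaded through every call site as a separate parameter.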
-101
@@ -171,7 +171,6 @@ line 5`,
 err := wrapDecodeError(doc, &unstable.ParserError{
 Highlight: hl,
-Offset: start,
 Message: e.msg,
 })
@@ -260,12 +259,6 @@ func TestDecodeError_Position(t *testing.T) {
 expectedRow: 3,
 minCol: 5,
 },
-{
-name: "missing equals on last line without trailing newline",
-doc: "a = 1\nb = 2\nc",
-expectedRow: 3,
-minCol: 1,
-},
 }
 for _, e := range examples {
@@ -287,100 +280,6 @@ func TestDecodeError_Position(t *testing.T) {
 }
 }
-func TestDecodeError_PositionAfterComments(t *testing.T) {
-examples := []struct {
-name string
-doc string
-expectedRow int
-expectedCol int
-errContains string
-}{
-{
-name: "invalid key after comment",
-doc: "# comment\n= \"value\"",
-expectedRow: 2,
-expectedCol: 1,
-errContains: "invalid character at start of key",
-},
-{
-name: "invalid key after multiple comments",
-doc: "# line 1\n# line 2\n= \"value\"",
-expectedRow: 3,
-expectedCol: 1,
-errContains: "invalid character at start of key",
-},
-{
-name: "invalid key after valid assignment and comment",
-doc: "a = 1\n# comment\n= \"value\"",
-expectedRow: 3,
-expectedCol: 1,
-errContains: "invalid character at start of key",
-},
-{
-name: "invalid key on first line",
-doc: "= \"value\"",
-expectedRow: 1,
-expectedCol: 1,
-errContains: "invalid character at start of key",
-},
-{
-name: "invalid key with leading whitespace",
-doc: "# comment\n = \"value\"",
-expectedRow: 2,
-expectedCol: 3,
-errContains: "invalid character at start of key",
-},
-}
-for _, e := range examples {
-t.Run(e.name, func(t *testing.T) {
-var v map[string]interface{}
-err := Unmarshal([]byte(e.doc), &v)
-if err == nil {
-t.Fatal("expected an error")
-}
-var derr *DecodeError
-if !errors.As(err, &derr) {
-t.Fatalf("expected DecodeError, got %T: %v", err, err)
-}
-row, col := derr.Position()
-if row != e.expectedRow {
-t.Errorf("row: got %d, want %d (error: %s)", row, e.expectedRow, derr.String())
-}
-if col != e.expectedCol {
-t.Errorf("col: got %d, want %d (error: %s)", col, e.expectedCol, derr.String())
-}
-if !strings.Contains(derr.Error(), e.errContains) {
-t.Errorf("error %q does not contain %q", derr.Error(), e.errContains)
-}
-})
-}
-}
-func TestDecodeError_HumanStringAfterComments(t *testing.T) {
-doc := "# comment\n= \"value\""
-var v map[string]interface{}
-err := Unmarshal([]byte(doc), &v)
-if err == nil {
-t.Fatal("expected an error")
-}
-var derr *DecodeError
-if !errors.As(err, &derr) {
-t.Fatalf("expected DecodeError, got %T: %v", err, err)
-}
-human := derr.String()
-if !strings.Contains(human, "= \"value\"") {
-t.Errorf("human-readable error should show the offending line, got:\n%s", human)
-}
-if !strings.Contains(human, "2|") {
-t.Errorf("human-readable error should reference line 2, got:\n%s", human)
-}
-}
 func TestStrictErrorUnwrap(t *testing.T) {
 fo := bytes.NewBufferString(`
 Missing = 1
+16 -12
@@ -24,57 +24,61 @@ import (
 // 0x9 => tab, ok
 // 0xA - 0x1F => invalid
 // 0x7F => invalid
-func Utf8TomlValidAlreadyEscaped(p []byte) int {
+func Utf8TomlValidAlreadyEscaped(p []byte) []byte {
-consumed := 0
 // Fast path. Check for and skip 8 bytes of ASCII characters per iteration.
 for len(p) >= 8 {
+// Combining two 32 bit loads allows the same code to be used
+// for 32 and 64 bit platforms.
+// The compiler can generate a 32bit load for first32 and second32
+// on many platforms. See test/codegen/memcombine.go.
 first32 := uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16 | uint32(p[3])<<24
 second32 := uint32(p[4]) | uint32(p[5])<<8 | uint32(p[6])<<16 | uint32(p[7])<<24
 if (first32|second32)&0x80808080 != 0 {
+// Found a non ASCII byte (>= RuneSelf).
 break
 }
 for i, b := range p[:8] {
 if InvalidASCII(b) {
-return consumed + i
+return p[i : i+1]
 }
 }
 p = p[8:]
-consumed += 8
 }
 n := len(p)
 for i := 0; i < n; {
 pi := p[i]
 if pi < utf8.RuneSelf {
 if InvalidASCII(pi) {
-return consumed + i
+return p[i : i+1]
 }
 i++
 continue
 }
 x := first[pi]
 if x == xx {
-return consumed + i
+// Illegal starter byte.
+return p[i : i+1]
 }
 size := int(x & 7)
 if i+size > n {
-return consumed + i
+// Short or invalid.
+return p[i:n]
 }
 accept := acceptRanges[x>>4]
 if c := p[i+1]; c < accept.lo || accept.hi < c {
-return consumed + i
+return p[i : i+2]
 } else if size == 2 { //revive:disable:empty-block
 } else if c := p[i+2]; c < locb || hicb < c {
-return consumed + i
+return p[i : i+3]
 } else if size == 3 { //revive:disable:empty-block
 } else if c := p[i+3]; c < locb || hicb < c {
-return consumed + i
+return p[i : i+4]
 }
 i += size
 }
-return -1
+return nil
 }
 // Utf8ValidNext returns the size of the next rune if valid, 0 otherwise.
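The diff above changes the validator's contract from "index of the first invalid byte, or -1" to "subslice covering the invalid bytes, or nil". A minimal sketch of that convention, using a simplified ASCII-only check (`firstInvalid` and `invalidASCIIByte` are illustrative stand-ins, not the library's functions):

```go
package main

import "fmt"

// invalidASCIIByte reports whether b is a control byte TOML forbids in
// strings (a simplified stand-in for the library's InvalidASCII).
func invalidASCIIByte(b byte) bool {
	return b <= 0x08 || (b >= 0x0A && b <= 0x1F) || b == 0x7F
}

// firstInvalid mirrors the new return convention: instead of an index it
// returns the subslice covering the offending byte (nil when valid), so a
// caller can hand it straight to an error's Highlight field.
func firstInvalid(p []byte) []byte {
	for i, b := range p {
		if b >= 0x80 {
			continue // multi-byte rune handling elided in this sketch
		}
		if invalidASCIIByte(b) {
			return p[i : i+1]
		}
	}
	return nil
}

func main() {
	if bad := firstInvalid([]byte("ok\x01bad")); bad != nil {
		fmt.Printf("invalid byte %#x\n", bad[0])
	}
}
```

Returning a subslice keeps the result aliased to the input, which is exactly what the offset-free `NewParserError` needs.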
+5 -5
@@ -32,7 +32,7 @@ func (d LocalDate) MarshalText() ([]byte, error) {
 // UnmarshalText parses b using RFC 3339 to fill d.
 func (d *LocalDate) UnmarshalText(b []byte) error {
-res, err := parseLocalDate(b, 0)
+res, err := parseLocalDate(b)
 if err != nil {
 return err
 }
@@ -75,9 +75,9 @@ func (d LocalTime) MarshalText() ([]byte, error) {
 // UnmarshalText parses b using RFC 3339 to fill d.
 func (d *LocalTime) UnmarshalText(b []byte) error {
-res, left, err := parseLocalTime(b, 0)
+res, left, err := parseLocalTime(b)
 if err == nil && len(left) != 0 {
-err = unstable.NewParserError(left, len(b)-len(left), "extra characters")
+err = unstable.NewParserError(left, "extra characters")
 }
 if err != nil {
 return err
@@ -109,9 +109,9 @@ func (d LocalDateTime) MarshalText() ([]byte, error) {
 // UnmarshalText parses b using RFC 3339 to fill d.
 func (d *LocalDateTime) UnmarshalText(data []byte) error {
-res, left, err := parseLocalDateTime(data, 0)
+res, left, err := parseLocalDateTime(data)
 if err == nil && len(left) != 0 {
-err = unstable.NewParserError(left, len(data)-len(left), "extra characters")
+err = unstable.NewParserError(left, "extra characters")
 }
 if err != nil {
 return err
+9 -12
@@ -704,18 +704,15 @@ func (enc *Encoder) encodeMap(b []byte, ctx encoderCtx, v reflect.Value) ([]byte
 for iter.Next() {
 v := iter.Value()
-// Handle nil values: convert nil pointers to zero value,
-// skip nil interfaces and nil maps.
-switch v.Kind() {
-case reflect.Ptr:
-if v.IsNil() {
+if isNil(v) {
+// For nil pointers, convert to zero value of the element type.
+// This allows round-trip marshaling of maps with nil pointer values.
+// For nil interfaces and nil maps, skip since we can't derive a type.
+if v.Kind() == reflect.Ptr {
 v = reflect.Zero(v.Type().Elem())
-}
-case reflect.Interface, reflect.Map:
-if v.IsNil() {
+} else {
 continue
 }
-default:
 }
 k, err := enc.keyToString(iter.Key())
@@ -939,7 +936,7 @@ func (enc *Encoder) encodeTable(b []byte, ctx encoderCtx, t table) ([]byte, erro
 if shouldOmitEmpty(kv.Options, kv.Value) {
 continue
 }
-if kv.Options.omitzero && shouldOmitZero(kv.Options, kv.Value) {
+if shouldOmitZero(kv.Options, kv.Value) {
 continue
 }
 hasNonEmptyKV = true
@@ -961,7 +958,7 @@ func (enc *Encoder) encodeTable(b []byte, ctx encoderCtx, t table) ([]byte, erro
 if shouldOmitEmpty(table.Options, table.Value) {
 continue
 }
-if table.Options.omitzero && shouldOmitZero(table.Options, table.Value) {
+if shouldOmitZero(table.Options, table.Value) {
 continue
 }
 if first {
@@ -998,7 +995,7 @@ func (enc *Encoder) encodeTableInline(b []byte, ctx encoderCtx, t table) ([]byte
 if shouldOmitEmpty(kv.Options, kv.Value) {
 continue
 }
-if kv.Options.omitzero && shouldOmitZero(kv.Options, kv.Value) {
+if shouldOmitZero(kv.Options, kv.Value) {
 continue
 }
+6 -8
@@ -54,12 +54,10 @@ func (s *strict) MissingTable(node *unstable.Node) {
 return
 }
-loc, offset := s.keyLocation(node)
 s.missing = append(s.missing, unstable.ParserError{
-Highlight: loc,
+Highlight: s.keyLocation(node),
 Message: "missing table",
 Key: s.key.Key(),
-Offset: offset,
 })
 }
@@ -68,12 +66,10 @@ func (s *strict) MissingField(node *unstable.Node) {
 return
 }
-loc, offset := s.keyLocation(node)
 s.missing = append(s.missing, unstable.ParserError{
-Highlight: loc,
+Highlight: s.keyLocation(node),
 Message: "missing field",
 Key: s.key.Key(),
-Offset: offset,
 })
 }
@@ -94,7 +90,7 @@ func (s *strict) Error(doc []byte) error {
 return err
 }
-func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
+func (s *strict) keyLocation(node *unstable.Node) []byte {
 k := node.Key()
 hasOne := k.Next()
@@ -102,6 +98,7 @@ func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
 panic("should not be called with empty key")
 }
+// Get the range from the first key to the last key.
 firstRaw := k.Node().Raw
 lastRaw := firstRaw
@@ -109,8 +106,9 @@ func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
 lastRaw = k.Node().Raw
 }
+// Compute the slice from the document using the ranges.
 start := firstRaw.Offset
 end := lastRaw.Offset + lastRaw.Length
-return s.doc[start:end], int(start)
+return s.doc[start:end]
 }
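The `keyLocation` change above highlights a dotted key by slicing the document from the first key part's range to the end of the last one. A self-contained sketch of that slicing, with an illustrative `rng` type standing in for the parser's `Range`:

```go
package main

import "fmt"

// rng mirrors the parser's Range: a byte offset and length into the document.
type rng struct{ Offset, Length uint32 }

// keySpan returns the document slice covering a dotted key, from the start
// of its first part to the end of its last (a sketch of keyLocation's logic).
func keySpan(doc []byte, parts []rng) []byte {
	first, last := parts[0], parts[len(parts)-1]
	return doc[first.Offset : last.Offset+last.Length]
}

func main() {
	doc := []byte("server.port = 8080")
	parts := []rng{{0, 6}, {7, 4}} // ranges of "server" and "port"
	fmt.Printf("%s\n", keySpan(doc, parts)) // server.port
}
```

Because the result aliases the document, the error machinery can recover the offset later without storing it separately.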
+4 -5
@@ -9,7 +9,7 @@ YELLOW='\033[1;33m'
 BLUE='\033[0;34m'
 NC='\033[0m' # No Color
-# Go versions to test (1.11 through 1.26)
+# Go versions to test (1.11 through 1.25)
 GO_VERSIONS=(
 "1.11"
 "1.12"
@@ -26,7 +26,6 @@ GO_VERSIONS=(
 "1.23"
 "1.24"
 "1.25"
-"1.26"
 )
 # Default values
@@ -65,7 +64,7 @@ EXAMPLES:
 $0 # Test all Go versions in parallel
 $0 --sequential # Test all Go versions sequentially
 $0 1.21 1.22 1.23 # Test specific versions
-$0 --verbose --output ./results 1.25 1.26 # Verbose output to custom directory
+$0 --verbose --output ./results 1.24 1.25 # Verbose output to custom directory
 EXIT CODES:
 0 Recent Go versions pass (good compatibility)
@@ -137,8 +136,8 @@ fi
 # Validate Go versions
 for version in "${GO_VERSIONS[@]}"; do
-if ! [[ "$version" =~ ^1\.(1[1-9]|2[0-6])$ ]]; then
-log_error "Invalid Go version: $version. Supported versions: 1.11-1.26"
+if ! [[ "$version" =~ ^1\.(1[1-9]|2[0-5])$ ]]; then
+log_error "Invalid Go version: $version. Supported versions: 1.11-1.25"
 exit 1
 fi
 done
+20 -20
@@ -625,7 +625,7 @@ func (d *decoder) handleTable(key unstable.Iterator, v reflect.Value) (reflect.V
 }
 }
 }
-return reflect.Value{}, unstable.NewParserError(key.Node().Data, int(key.Node().Raw.Offset), "cannot store a table in a slice")
+return reflect.Value{}, unstable.NewParserError(key.Node().Data, "cannot store a table in a slice")
 }
 if key.Next() {
 // Still scoping the key
@@ -748,7 +748,7 @@ func (d *decoder) tryTextUnmarshaler(node *unstable.Node, v reflect.Value) (bool
 if v.CanAddr() && v.Addr().Type().Implements(textUnmarshalerType) {
 err := v.Addr().Interface().(encoding.TextUnmarshaler).UnmarshalText(node.Data)
 if err != nil {
-return false, unstable.NewParserError(d.p.Raw(node.Raw), int(node.Raw.Offset), "%w", err)
+return false, unstable.NewParserError(d.p.Raw(node.Raw), "%w", err)
 }
 return true, nil
@@ -896,7 +896,7 @@ func (d *decoder) unmarshalInlineTable(itable *unstable.Node, v reflect.Value) e
 }
 return d.unmarshalInlineTable(itable, elem)
 default:
-return unstable.NewParserError(d.p.Raw(itable.Raw), int(itable.Raw.Offset), "cannot store inline table in Go type %s", v.Kind())
+return unstable.NewParserError(d.p.Raw(itable.Raw), "cannot store inline table in Go type %s", v.Kind())
 }
 it := itable.Children()
@@ -916,26 +916,26 @@ }
 func (d *decoder) unmarshalDateTime(value *unstable.Node, v reflect.Value) error {
-dt, err := parseDateTime(value.Data, int(value.Raw.Offset))
+dt, err := parseDateTime(value.Data)
 if err != nil {
 return err
 }
 if v.Kind() != reflect.Interface && v.Type() != timeType {
-return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("datetime", v.Type()))
+return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("datetime", v.Type()))
 }
 v.Set(reflect.ValueOf(dt))
 return nil
 }
 func (d *decoder) unmarshalLocalDate(value *unstable.Node, v reflect.Value) error {
-ld, err := parseLocalDate(value.Data, int(value.Raw.Offset))
+ld, err := parseLocalDate(value.Data)
 if err != nil {
 return err
 }
 if v.Kind() != reflect.Interface && v.Type() != timeType {
-return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local date", v.Type()))
+return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local date", v.Type()))
 }
 if v.Type() == timeType {
 v.Set(reflect.ValueOf(ld.AsTime(time.Local)))
@@ -946,34 +946,34 @@ }
 func (d *decoder) unmarshalLocalTime(value *unstable.Node, v reflect.Value) error {
-lt, rest, err := parseLocalTime(value.Data, int(value.Raw.Offset))
+lt, rest, err := parseLocalTime(value.Data)
 if err != nil {
 return err
 }
 if len(rest) > 0 {
-return unstable.NewParserError(rest, int(value.Raw.Offset)+len(value.Data)-len(rest), "extra characters at the end of a local time")
+return unstable.NewParserError(rest, "extra characters at the end of a local time")
 }
 if v.Kind() != reflect.Interface {
-return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local time", v.Type()))
+return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local time", v.Type()))
 }
 v.Set(reflect.ValueOf(lt))
 return nil
 }
 func (d *decoder) unmarshalLocalDateTime(value *unstable.Node, v reflect.Value) error {
-ldt, rest, err := parseLocalDateTime(value.Data, int(value.Raw.Offset))
+ldt, rest, err := parseLocalDateTime(value.Data)
 if err != nil {
 return err
 }
 if len(rest) > 0 {
-return unstable.NewParserError(rest, int(value.Raw.Offset)+len(value.Data)-len(rest), "extra characters at the end of a local date time")
+return unstable.NewParserError(rest, "extra characters at the end of a local date time")
 }
 if v.Kind() != reflect.Interface && v.Type() != timeType {
-return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local datetime", v.Type()))
+return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local datetime", v.Type()))
 }
 if v.Type() == timeType {
 v.Set(reflect.ValueOf(ldt.AsTime(time.Local)))
@@ -992,14 +992,14 @@ func (d *decoder) unmarshalBool(value *unstable.Node, v reflect.Value) error {
 case reflect.Interface:
 v.Set(reflect.ValueOf(b))
 default:
-return unstable.NewParserError(value.Data, int(value.Raw.Offset), "cannot assign boolean to a %t", b)
+return unstable.NewParserError(value.Data, "cannot assign boolean to a %t", b)
 }
 return nil
 }
 func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error {
-f, err := parseFloat(value.Data, int(value.Raw.Offset))
+f, err := parseFloat(value.Data)
 if err != nil {
 return err
 }
@@ -1009,13 +1009,13 @@ func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error {
 v.SetFloat(f)
 case reflect.Float32:
 if f > math.MaxFloat32 {
-return unstable.NewParserError(value.Data, int(value.Raw.Offset), "number %f does not fit in a float32", f)
+return unstable.NewParserError(value.Data, "number %f does not fit in a float32", f)
 }
 v.SetFloat(f)
 case reflect.Interface:
 v.Set(reflect.ValueOf(f))
 default:
-return unstable.NewParserError(value.Data, int(value.Raw.Offset), "float cannot be assigned to %s", v.Kind())
+return unstable.NewParserError(value.Data, "float cannot be assigned to %s", v.Kind())
 }
 return nil
@@ -1048,7 +1048,7 @@ func (d *decoder) unmarshalInteger(value *unstable.Node, v reflect.Value) error
 return d.unmarshalFloat(value, v)
 }
-i, err := parseInteger(value.Data, int(value.Raw.Offset))
+i, err := parseInteger(value.Data)
 if err != nil {
 return err
 }
@@ -1116,7 +1116,7 @@ func (d *decoder) unmarshalInteger(value *unstable.Node, v reflect.Value) error
 case reflect.Interface:
 r = reflect.ValueOf(i)
 default:
-return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("integer", v.Type()))
+return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("integer", v.Type()))
 }
 if !r.Type().AssignableTo(v.Type()) {
@@ -1135,7 +1135,7 @@ func (d *decoder) unmarshalString(value *unstable.Node, v reflect.Value) error {
 case reflect.Interface:
 v.Set(reflect.ValueOf(string(value.Data)))
 default:
-return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("string", v.Type()))
+return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("string", v.Type()))
 }
 return nil
+257 -1
@@ -4554,10 +4554,266 @@ func (c *customTable873) UnmarshalTOML(data []byte) error {
 c.Keys = append(c.Keys, key)
 c.Values[key] = string(valueBytes)
 }
 return nil
 }
func TestIssue873_TableUnmarshaler(t *testing.T) {
type Config struct {
Section customTable873 `toml:"section"`
}
doc := `
[section]
key1 = "value1"
key2 = "value2"
key3 = "value3"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, []string{"key1", "key2", "key3"}, cfg.Section.Keys)
assert.Equal(t, "value1", cfg.Section.Values["key1"])
assert.Equal(t, "value2", cfg.Section.Values["key2"])
assert.Equal(t, "value3", cfg.Section.Values["key3"])
}
func TestIssue873_TableUnmarshaler_EmptyTable(t *testing.T) {
type Config struct {
Section customTable873 `toml:"section"`
}
doc := `
[section]
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, []string{}, cfg.Section.Keys)
}
func TestIssue873_ArrayTableUnmarshaler(t *testing.T) {
type Config struct {
Items []customTable873 `toml:"items"`
}
doc := `
[[items]]
name = "first"
id = "1"
[[items]]
name = "second"
id = "2"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, 2, len(cfg.Items))
assert.Equal(t, []string{"name", "id"}, cfg.Items[0].Keys)
assert.Equal(t, "first", cfg.Items[0].Values["name"])
assert.Equal(t, "1", cfg.Items[0].Values["id"])
assert.Equal(t, []string{"name", "id"}, cfg.Items[1].Keys)
assert.Equal(t, "second", cfg.Items[1].Values["name"])
assert.Equal(t, "2", cfg.Items[1].Values["id"])
}
func TestIssue873_NestedTableUnmarshaler(t *testing.T) {
type Config struct {
Outer struct {
Inner customTable873 `toml:"inner"`
} `toml:"outer"`
}
doc := `
[outer.inner]
a = "A"
b = "B"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, []string{"a", "b"}, cfg.Outer.Inner.Keys)
assert.Equal(t, "A", cfg.Outer.Inner.Values["a"])
assert.Equal(t, "B", cfg.Outer.Inner.Values["b"])
}
func TestIssue873_TableUnmarshaler_MultipleTables(t *testing.T) {
type Config struct {
First customTable873 `toml:"first"`
Second customTable873 `toml:"second"`
}
doc := `
[first]
key1 = "value1"
[second]
key2 = "value2"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, []string{"key1"}, cfg.First.Keys)
assert.Equal(t, "value1", cfg.First.Values["key1"])
assert.Equal(t, []string{"key2"}, cfg.Second.Keys)
assert.Equal(t, "value2", cfg.Second.Values["key2"])
}
// Test that regular struct fields still work alongside Unmarshaler tables
func TestIssue873_MixedWithRegularFields(t *testing.T) {
type Config struct {
Name string `toml:"name"`
Section customTable873 `toml:"section"`
Count int `toml:"count"`
}
doc := `
name = "test"
count = 42
[section]
foo = "bar"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, "test", cfg.Name)
assert.Equal(t, 42, cfg.Count)
assert.Equal(t, []string{"foo"}, cfg.Section.Keys)
assert.Equal(t, "bar", cfg.Section.Values["foo"])
}
// Test that pointer to Unmarshaler type works
func TestIssue873_PointerToUnmarshaler(t *testing.T) {
type Config struct {
Section *customTable873 `toml:"section"`
}
doc := `
[section]
hello = "world"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.True(t, cfg.Section != nil)
assert.Equal(t, []string{"hello"}, cfg.Section.Keys)
assert.Equal(t, "world", cfg.Section.Values["hello"])
}
// Test table with sub-tables defined separately
func TestIssue873_TableWithSubTables(t *testing.T) {
type Config struct {
Parent customTable873 `toml:"parent"`
}
doc := `
[parent]
name = "root"
[parent.child]
name = "nested"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
// The parent should only get the keys directly under [parent],
// not the [parent.child] sub-table
assert.NoError(t, err)
assert.Equal(t, []string{"name"}, cfg.Parent.Keys)
assert.Equal(t, "root", cfg.Parent.Values["name"])
}
// Test for issue #994 follow-up - tables defined piece-wise
// This addresses the maintainer's comment: "It doesn't deal with tables defined piece-wise yet"
func TestIssue994_TablesPieceWise(t *testing.T) {
// Test with piece-wise table definition (using [table] syntax)
// The customTable873 type captures key-value pairs in order,
// which is useful for use cases like maintaining map ordering
doc := `
[section]
first = "1"
second = "2"
third = "3"
`
type Config struct {
Section customTable873 `toml:"section"`
}
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
// Verify ordering is preserved (keys are collected in document order)
assert.Equal(t, []string{"first", "second", "third"}, cfg.Section.Keys)
assert.Equal(t, "1", cfg.Section.Values["first"])
assert.Equal(t, "2", cfg.Section.Values["second"])
assert.Equal(t, "3", cfg.Section.Values["third"])
}
// Test root-level struct with tables - combines #994 fix with #873 enhancement
func TestIssue994_RootWithTables(t *testing.T) {
type rootDoc struct {
Tables []customTable873 `toml:"tables"`
}
doc := `
[[tables]]
name = "first"
value = "one"
[[tables]]
name = "second"
value = "two"
`
var d rootDoc
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&d)
assert.NoError(t, err)
assert.Equal(t, 2, len(d.Tables))
assert.Equal(t, "first", d.Tables[0].Values["name"])
assert.Equal(t, "one", d.Tables[0].Values["value"])
assert.Equal(t, "second", d.Tables[1].Values["name"])
assert.Equal(t, "two", d.Tables[1].Values["value"])
}
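The tests above all rely on the `customTable873` helper, whose `UnmarshalTOML(data []byte) error` method receives the raw bytes of a table and records keys in document order. A self-contained sketch of such a type (the naive line parsing here is illustrative only, not the library's parser, and `orderedTable` is a hypothetical name):

```go
package main

import (
	"fmt"
	"strings"
)

// orderedTable mimics the customTable873 test helper: it implements the
// UnmarshalTOML([]byte) error method used with EnableUnmarshalerInterface,
// collecting keys in document order.
type orderedTable struct {
	Keys   []string
	Values map[string]string
}

// UnmarshalTOML does a naive split of "key = value" lines; a real
// implementation would parse the raw TOML bytes properly.
func (o *orderedTable) UnmarshalTOML(data []byte) error {
	o.Keys = []string{}
	o.Values = map[string]string{}
	for _, line := range strings.Split(string(data), "\n") {
		k, v, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		k = strings.TrimSpace(k)
		v = strings.Trim(strings.TrimSpace(v), `"`)
		o.Keys = append(o.Keys, k)
		o.Values[k] = v
	}
	return nil
}

func main() {
	var t orderedTable
	_ = t.UnmarshalTOML([]byte("first = \"1\"\nsecond = \"2\""))
	fmt.Println(t.Keys, t.Values["second"]) // [first second] 2
}
```

Preserving key order in the custom type is what the TestIssue994_TablesPieceWise assertions verify.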
 // Test for split tables - when the same parent table is defined in multiple places
 // This is a key requirement for issue #873: if type A implements Unmarshaler,
 // and [a.b] and [a.d] are defined with another table [x] in between,
+3 -7
@@ -28,16 +28,12 @@ func (c *Iterator) Next() bool {
 if c.nodes == nil {
 return false
 }
-nodes := *c.nodes
 if !c.started {
 c.started = true
-} else {
-idx := c.idx
-if idx >= 0 && int(idx) < len(nodes) {
-c.idx = nodes[idx].next
-}
+} else if c.idx >= 0 {
+c.idx = (*c.nodes)[c.idx].next
 }
-return c.idx >= 0 && int(c.idx) < len(nodes)
+return c.idx >= 0 && int(c.idx) < len(*c.nodes)
 }
 // IsLast returns true if the current node of the iterator is the last
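The simplified `Next` above walks siblings through `next` indices stored in a flat node slice. A runnable sketch of that index-linked iteration pattern (types and field names here are illustrative, not the library's):

```go
package main

import "fmt"

// node is stored in a flat slice; next is the index of the following
// sibling, or -1 at the end of the list (mirroring the iterator's layout).
type node struct {
	value string
	next  int
}

type iterator struct {
	nodes   *[]node
	idx     int
	started bool
}

// Next advances to the next sibling by following the next links, as in the
// simplified Iterator.Next above.
func (c *iterator) Next() bool {
	if c.nodes == nil {
		return false
	}
	if !c.started {
		c.started = true
	} else if c.idx >= 0 {
		c.idx = (*c.nodes)[c.idx].next
	}
	return c.idx >= 0 && c.idx < len(*c.nodes)
}

func main() {
	nodes := []node{{"a", 1}, {"b", 2}, {"c", -1}}
	it := iterator{nodes: &nodes}
	for it.Next() {
		fmt.Print(nodes[it.idx].value)
	}
	fmt.Println() // abc
}
```

Storing the linkage as indices rather than pointers keeps nodes in one allocation and lets the iterator be copied by value.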
+1 -1
@@ -35,7 +35,7 @@ func BenchmarkScanComments(b *testing.B) {
 b.ResetTimer()
 for i := 0; i < b.N; i++ {
-_, _, _ = scanComment(input, 0)
+_, _, _ = scanComment(input)
 }
 })
 }
+72 -62
@@ -16,7 +16,6 @@ type ParserError struct {
Highlight []byte Highlight []byte
Message string Message string
Key []string // optional Key []string // optional
Offset int
} }
// Error is the implementation of the error interface. // Error is the implementation of the error interface.
@@ -28,10 +27,9 @@ func (e *ParserError) Error() string {
// //
// Warning: Highlight needs to be a subslice of Parser.data, so only slices // Warning: Highlight needs to be a subslice of Parser.data, so only slices
// returned by Parser.Raw are valid candidates. // returned by Parser.Raw are valid candidates.
func NewParserError(highlight []byte, offset int, format string, args ...interface{}) error { func NewParserError(highlight []byte, format string, args ...interface{}) error {
return &ParserError{ return &ParserError{
Highlight: highlight, Highlight: highlight,
Offset: offset,
Message: fmt.Errorf(format, args...).Error(), Message: fmt.Errorf(format, args...).Error(),
} }
} }
@@ -66,8 +64,14 @@ func (p *Parser) Data() []byte {
return p.data return p.data
} }
func (p *Parser) offsetOf(b []byte) int { // Range returns a range description that corresponds to a given slice of the
return len(p.data) - len(b) // input. If the argument is not a subslice of the parser input, this function
// panics.
func (p *Parser) Range(b []byte) Range {
return Range{
Offset: uint32(p.subsliceOffset(b)), //nolint:gosec // TOML documents are small
Length: uint32(len(b)), //nolint:gosec // TOML documents are small
}
} }
// rangeOfToken computes the Range of a token given the remaining bytes after the token. // rangeOfToken computes the Range of a token given the remaining bytes after the token.
@@ -78,6 +82,13 @@ func (p *Parser) rangeOfToken(token, rest []byte) Range {
return Range{Offset: uint32(offset), Length: uint32(len(token))} //nolint:gosec // TOML documents are small return Range{Offset: uint32(offset), Length: uint32(len(token))} //nolint:gosec // TOML documents are small
} }
// subsliceOffset returns the byte offset of subslice b within p.data.
// b must be a suffix (tail) of p.data.
func (p *Parser) subsliceOffset(b []byte) int {
// b is a suffix of p.data, so its offset is len(p.data) - len(b)
return len(p.data) - len(b)
}
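The subsliceOffset helper works because the parser only ever passes it tails of p.data: two slices over the same backing array that end at the same byte differ in offset exactly by their difference in length. A minimal sketch of that invariant (function name hypothetical):

```go
package main

import "fmt"

// suffixOffset returns the byte offset of b within data, assuming b is a
// tail of data (b == data[k:] for some k). Both slices end at the same
// byte of the shared backing array, so the offset is the length delta.
func suffixOffset(data, b []byte) int {
	return len(data) - len(b)
}

func main() {
	data := []byte(`key = "value"`)
	rest := data[6:] // the unconsumed input after `key = `
	fmt.Println(suffixOffset(data, rest)) // 6
}
```

Note the precondition matters: for a subslice that is not a tail, length arithmetic alone cannot recover the offset.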
// Raw returns the slice corresponding to the bytes in the given range. // Raw returns the slice corresponding to the bytes in the given range.
func (p *Parser) Raw(raw Range) []byte { func (p *Parser) Raw(raw Range) []byte {
return p.data[raw.Offset : raw.Offset+raw.Length] return p.data[raw.Offset : raw.Offset+raw.Length]
@@ -187,16 +198,16 @@ func (p *Parser) parseNewline(b []byte) ([]byte, error) {
} }
if b[0] == '\r' { if b[0] == '\r' {
_, rest, err := scanWindowsNewline(b, p.offsetOf(b)) _, rest, err := scanWindowsNewline(b)
return rest, err return rest, err
} }
return nil, NewParserError(b[0:1], p.offsetOf(b), "expected newline but got %#U", b[0]) return nil, NewParserError(b[0:1], "expected newline but got %#U", b[0])
} }
func (p *Parser) parseComment(b []byte) (reference, []byte, error) { func (p *Parser) parseComment(b []byte) (reference, []byte, error) {
ref := invalidReference ref := invalidReference
data, rest, err := scanComment(b, p.offsetOf(b)) data, rest, err := scanComment(b)
if p.KeepComments && err == nil { if p.KeepComments && err == nil {
ref = p.builder.Push(Node{ ref = p.builder.Push(Node{
Kind: Comment, Kind: Comment,
@@ -280,12 +291,12 @@ func (p *Parser) parseArrayTable(b []byte) (reference, []byte, error) {
p.builder.AttachChild(ref, k) p.builder.AttachChild(ref, k)
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
b, err = expect(']', b, p.offsetOf(b)) b, err = expect(']', b)
if err != nil { if err != nil {
return ref, nil, err return ref, nil, err
} }
b, err = expect(']', b, p.offsetOf(b)) b, err = expect(']', b)
return ref, b, err return ref, b, err
} }
@@ -310,7 +321,7 @@ func (p *Parser) parseStdTable(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
b, err = expect(']', b, p.offsetOf(b)) b, err = expect(']', b)
return ref, b, err return ref, b, err
} }
@@ -334,10 +345,10 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
if len(b) == 0 { if len(b) == 0 {
return invalidReference, nil, NewParserError(startB[:len(startB)-len(b)], p.offsetOf(startB), "expected = after a key, but the document ends there") return invalidReference, nil, NewParserError(b, "expected = after a key, but the document ends there")
} }
b, err = expect('=', b, p.offsetOf(b)) b, err = expect('=', b)
if err != nil { if err != nil {
return invalidReference, nil, err return invalidReference, nil, err
} }
@@ -352,10 +363,9 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
p.builder.Chain(valRef, key) p.builder.Chain(valRef, key)
p.builder.AttachChild(ref, valRef) p.builder.AttachChild(ref, valRef)
// Set Raw to span the entire key-value expression. // Set Raw to span the entire key-value expression
// Access the node directly in the slice to avoid the write barrier node := p.builder.NodeAt(ref)
// that NodeAt's nodes-pointer setup would trigger. node.Raw = p.rangeOfToken(startB[:len(startB)-len(b)], b)
p.builder.tree.nodes[ref].Raw = p.rangeOfToken(startB[:len(startB)-len(b)], b)
return ref, b, err return ref, b, err
} }
@@ -366,7 +376,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
ref := invalidReference ref := invalidReference
if len(b) == 0 { if len(b) == 0 {
return ref, nil, NewParserError(b, p.offsetOf(b), "expected value, not eof") return ref, nil, NewParserError(b, "expected value, not eof")
} }
var err error var err error
@@ -411,7 +421,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
return ref, b, err return ref, b, err
case 't': case 't':
if !scanFollowsTrue(b) { if !scanFollowsTrue(b) {
return ref, nil, NewParserError(atmost(b, 4), p.offsetOf(b), "expected 'true'") return ref, nil, NewParserError(atmost(b, 4), "expected 'true'")
} }
ref = p.builder.Push(Node{ ref = p.builder.Push(Node{
@@ -422,7 +432,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
return ref, b[4:], nil return ref, b[4:], nil
case 'f': case 'f':
if !scanFollowsFalse(b) { if !scanFollowsFalse(b) {
return ref, nil, NewParserError(atmost(b, 5), p.offsetOf(b), "expected 'false'") return ref, nil, NewParserError(atmost(b, 5), "expected 'false'")
} }
ref = p.builder.Push(Node{ ref = p.builder.Push(Node{
@@ -449,7 +459,7 @@ func atmost(b []byte, n int) []byte {
} }
func (p *Parser) parseLiteralString(b []byte) ([]byte, []byte, []byte, error) { func (p *Parser) parseLiteralString(b []byte) ([]byte, []byte, []byte, error) {
v, rest, err := scanLiteralString(b, p.offsetOf(b)) v, rest, err := scanLiteralString(b)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -481,7 +491,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
if len(b) == 0 { if len(b) == 0 {
return parent, nil, NewParserError(previousB[:1], p.offsetOf(previousB), "inline table is incomplete") return parent, nil, NewParserError(previousB[:1], "inline table is incomplete")
} }
if b[0] == '}' { if b[0] == '}' {
@@ -489,7 +499,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
} }
if !first { if !first {
b, err = expect(',', b, p.offsetOf(b)) b, err = expect(',', b)
if err != nil { if err != nil {
return parent, nil, err return parent, nil, err
} }
@@ -513,7 +523,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
first = false first = false
} }
rest, err := expect('}', b, p.offsetOf(b)) rest, err := expect('}', b)
return parent, rest, err return parent, rest, err
} }
@@ -562,7 +572,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
} }
if len(b) == 0 { if len(b) == 0 {
return parent, nil, NewParserError(arrayStart[:1], p.offsetOf(arrayStart), "array is incomplete") return parent, nil, NewParserError(arrayStart[:1], "array is incomplete")
} }
if b[0] == ']' { if b[0] == ']' {
@@ -571,7 +581,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
if b[0] == ',' { if b[0] == ',' {
if first { if first {
return parent, nil, NewParserError(b[0:1], p.offsetOf(b), "array cannot start with comma") return parent, nil, NewParserError(b[0:1], "array cannot start with comma")
} }
b = b[1:] b = b[1:]
@@ -583,7 +593,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
addChild(cref) addChild(cref)
} }
} else if !first { } else if !first {
return parent, nil, NewParserError(b[0:1], p.offsetOf(b), "array elements must be separated by commas") return parent, nil, NewParserError(b[0:1], "array elements must be separated by commas")
} }
// TOML allows trailing commas in arrays. // TOML allows trailing commas in arrays.
@@ -610,7 +620,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
first = false first = false
} }
rest, err := expect(']', b, p.offsetOf(b)) rest, err := expect(']', b)
return parent, rest, err return parent, rest, err
} }
@@ -665,7 +675,7 @@ func (p *Parser) parseOptionalWhitespaceCommentNewline(b []byte) (reference, []b
} }
func (p *Parser) parseMultilineLiteralString(b []byte) ([]byte, []byte, []byte, error) { func (p *Parser) parseMultilineLiteralString(b []byte) ([]byte, []byte, []byte, error) {
token, rest, err := scanMultilineLiteralString(b, p.offsetOf(b)) token, rest, err := scanMultilineLiteralString(b)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -694,7 +704,7 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
// mlb-quotes = 1*2quotation-mark // mlb-quotes = 1*2quotation-mark
// mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii // mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// mlb-escaped-nl = escape ws newline *( wschar / newline ) // mlb-escaped-nl = escape ws newline *( wschar / newline )
token, escaped, rest, err := scanMultilineBasicString(b, p.offsetOf(b)) token, escaped, rest, err := scanMultilineBasicString(b)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -711,15 +721,14 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
// fast path // fast path
startIdx := i startIdx := i
endIdx := len(token) - len(`"""`) endIdx := len(token) - len(`"""`)
tokenBase := p.offsetOf(token)
if !escaped { if !escaped {
str := token[startIdx:endIdx] str := token[startIdx:endIdx]
invalidIdx := characters.Utf8TomlValidAlreadyEscaped(str) highlight := characters.Utf8TomlValidAlreadyEscaped(str)
if invalidIdx < 0 { if len(highlight) == 0 {
return token, str, rest, nil return token, str, rest, nil
} }
return nil, nil, nil, NewParserError(str[invalidIdx:invalidIdx+1], tokenBase+startIdx+invalidIdx, "invalid UTF-8") return nil, nil, nil, NewParserError(highlight, "invalid UTF-8")
} }
var builder bytes.Buffer var builder bytes.Buffer
@@ -784,14 +793,14 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
case 'e': case 'e':
builder.WriteByte(0x1B) builder.WriteByte(0x1B)
case 'u': case 'u':
x, err := hexToRune(atmost(token[i+1:], 4), tokenBase+i+1, 4) x, err := hexToRune(atmost(token[i+1:], 4), 4)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
builder.WriteRune(x) builder.WriteRune(x)
i += 4 i += 4
case 'U': case 'U':
x, err := hexToRune(atmost(token[i+1:], 8), tokenBase+i+1, 8) x, err := hexToRune(atmost(token[i+1:], 8), 8)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -799,13 +808,13 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
builder.WriteRune(x) builder.WriteRune(x)
i += 8 i += 8
default: default:
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid escaped character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], "invalid escaped character %#U", c)
} }
i++ i++
} else { } else {
size := characters.Utf8ValidNext(token[i:]) size := characters.Utf8ValidNext(token[i:])
if size == 0 { if size == 0 {
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], "invalid character %#U", c)
} }
builder.Write(token[i : i+size]) builder.Write(token[i : i+size])
i += size i += size
@@ -860,9 +869,12 @@ func (p *Parser) parseKey(b []byte) (reference, []byte, error) {
func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) { func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) {
if len(b) == 0 { if len(b) == 0 {
return nil, nil, nil, NewParserError(b, p.offsetOf(b), "expected key but found none") return nil, nil, nil, NewParserError(b, "expected key but found none")
} }
// simple-key = quoted-key / unquoted-key
// unquoted-key = 1*( ALPHA / DIGIT / %x2D / %x5F ) ; A-Z / a-z / 0-9 / - / _
// quoted-key = basic-string / literal-string
switch { switch {
case b[0] == '\'': case b[0] == '\'':
return p.parseLiteralString(b) return p.parseLiteralString(b)
@@ -872,7 +884,7 @@ func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) {
key, rest = scanUnquotedKey(b) key, rest = scanUnquotedKey(b)
return key, key, rest, nil return key, key, rest, nil
default: default:
return nil, nil, nil, NewParserError(b[0:1], p.offsetOf(b), "invalid character at start of key: %c", b[0]) return nil, nil, nil, NewParserError(b[0:1], "invalid character at start of key: %c", b[0])
} }
} }
@@ -892,7 +904,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
// escape-seq-char =/ %x74 ; t tab U+0009 // escape-seq-char =/ %x74 ; t tab U+0009
// escape-seq-char =/ %x75 4HEXDIG ; uXXXX U+XXXX // escape-seq-char =/ %x75 4HEXDIG ; uXXXX U+XXXX
// escape-seq-char =/ %x55 8HEXDIG ; UXXXXXXXX U+XXXXXXXX // escape-seq-char =/ %x55 8HEXDIG ; UXXXXXXXX U+XXXXXXXX
token, escaped, rest, err := scanBasicString(b, p.offsetOf(b)) token, escaped, rest, err := scanBasicString(b)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -903,15 +915,13 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
// Fast path. If there is no escape sequence, the string should just be // Fast path. If there is no escape sequence, the string should just be
// an UTF-8 encoded string, which is the same as Go. In that case, // an UTF-8 encoded string, which is the same as Go. In that case,
// validate the string and return a direct reference to the buffer. // validate the string and return a direct reference to the buffer.
tokenBase := p.offsetOf(token)
if !escaped { if !escaped {
str := token[startIdx:endIdx] str := token[startIdx:endIdx]
invalidIdx := characters.Utf8TomlValidAlreadyEscaped(str) highlight := characters.Utf8TomlValidAlreadyEscaped(str)
if invalidIdx < 0 { if len(highlight) == 0 {
return token, str, rest, nil return token, str, rest, nil
} }
return nil, nil, nil, NewParserError(str[invalidIdx:invalidIdx+1], tokenBase+startIdx+invalidIdx, "invalid UTF-8") return nil, nil, nil, NewParserError(highlight, "invalid UTF-8")
} }
i := startIdx i := startIdx
@@ -942,7 +952,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
case 'e': case 'e':
builder.WriteByte(0x1B) builder.WriteByte(0x1B)
case 'u': case 'u':
x, err := hexToRune(token[i+1:len(token)-1], tokenBase+i+1, 4) x, err := hexToRune(token[i+1:len(token)-1], 4)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -950,7 +960,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
builder.WriteRune(x) builder.WriteRune(x)
i += 4 i += 4
case 'U': case 'U':
x, err := hexToRune(token[i+1:len(token)-1], tokenBase+i+1, 8) x, err := hexToRune(token[i+1:len(token)-1], 8)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -958,13 +968,13 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
builder.WriteRune(x) builder.WriteRune(x)
i += 8 i += 8
default: default:
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid escaped character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], "invalid escaped character %#U", c)
} }
i++ i++
} else { } else {
size := characters.Utf8ValidNext(token[i:]) size := characters.Utf8ValidNext(token[i:])
if size == 0 { if size == 0 {
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], "invalid character %#U", c)
} }
builder.Write(token[i : i+size]) builder.Write(token[i : i+size])
i += size i += size
@@ -974,9 +984,9 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
return token, builder.Bytes(), rest, nil return token, builder.Bytes(), rest, nil
} }
func hexToRune(b []byte, base int, length int) (rune, error) { func hexToRune(b []byte, length int) (rune, error) {
if len(b) < length { if len(b) < length {
return -1, NewParserError(b, base, "unicode point needs %d character, not %d", length, len(b)) return -1, NewParserError(b, "unicode point needs %d character, not %d", length, len(b))
} }
b = b[:length] b = b[:length]
@@ -991,13 +1001,13 @@ func hexToRune(b []byte, base int, length int) (rune, error) {
case 'A' <= c && c <= 'F': case 'A' <= c && c <= 'F':
d = uint32(c - 'A' + 10) d = uint32(c - 'A' + 10)
default: default:
return -1, NewParserError(b[i:i+1], base+i, "non-hex character") return -1, NewParserError(b[i:i+1], "non-hex character")
} }
r = r*16 + d r = r*16 + d
} }
if r > unicode.MaxRune || 0xD800 <= r && r < 0xE000 { if r > unicode.MaxRune || 0xD800 <= r && r < 0xE000 {
return -1, NewParserError(b, base, "escape sequence is invalid Unicode code point") return -1, NewParserError(b, "escape sequence is invalid Unicode code point")
} }
return rune(r), nil return rune(r), nil
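The hex-escape decoder above accumulates digits base-16 and rejects both out-of-range values and UTF-16 surrogates. A compact standalone version of the same checks (helper name hypothetical; the real function reports errors through ParserError instead):

```go
package main

import (
	"fmt"
	"unicode"
)

// decodeHexRune parses exactly length hex digits from b into a rune,
// rejecting values above unicode.MaxRune and the surrogate range
// U+D800..U+DFFF, which are invalid Unicode scalar values.
func decodeHexRune(b []byte, length int) (rune, error) {
	if len(b) < length {
		return -1, fmt.Errorf("unicode point needs %d characters, not %d", length, len(b))
	}
	var r uint32
	for _, c := range b[:length] {
		var d uint32
		switch {
		case c >= '0' && c <= '9':
			d = uint32(c - '0')
		case c >= 'a' && c <= 'f':
			d = uint32(c-'a') + 10
		case c >= 'A' && c <= 'F':
			d = uint32(c-'A') + 10
		default:
			return -1, fmt.Errorf("non-hex character %q", c)
		}
		r = r*16 + d
	}
	if r > unicode.MaxRune || (0xD800 <= r && r < 0xE000) {
		return -1, fmt.Errorf("escape sequence is invalid Unicode code point %#x", r)
	}
	return rune(r), nil
}

func main() {
	r, err := decodeHexRune([]byte("00e9"), 4)
	fmt.Println(r, err) // 233 <nil> (U+00E9, é)
}
```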
@@ -1017,7 +1027,7 @@ func (p *Parser) parseIntOrFloatOrDateTime(b []byte) (reference, []byte, error)
switch b[0] { switch b[0] {
case 'i': case 'i':
if !scanFollowsInf(b) { if !scanFollowsInf(b) {
return invalidReference, nil, NewParserError(atmost(b, 3), p.offsetOf(b), "expected 'inf'") return invalidReference, nil, NewParserError(atmost(b, 3), "expected 'inf'")
} }
return p.builder.Push(Node{ return p.builder.Push(Node{
@@ -1027,7 +1037,7 @@ func (p *Parser) parseIntOrFloatOrDateTime(b []byte) (reference, []byte, error)
}), b[3:], nil }), b[3:], nil
case 'n': case 'n':
if !scanFollowsNan(b) { if !scanFollowsNan(b) {
return invalidReference, nil, NewParserError(atmost(b, 3), p.offsetOf(b), "expected 'nan'") return invalidReference, nil, NewParserError(atmost(b, 3), "expected 'nan'")
} }
return p.builder.Push(Node{ return p.builder.Push(Node{
@@ -1186,7 +1196,7 @@ func (p *Parser) scanIntOrFloat(b []byte) (reference, []byte, error) {
}), b[i+3:], nil }), b[i+3:], nil
} }
return invalidReference, nil, NewParserError(b[i:i+1], p.offsetOf(b)+i, "unexpected character 'i' while scanning for a number") return invalidReference, nil, NewParserError(b[i:i+1], "unexpected character 'i' while scanning for a number")
} }
if c == 'n' { if c == 'n' {
@@ -1198,14 +1208,14 @@ func (p *Parser) scanIntOrFloat(b []byte) (reference, []byte, error) {
}), b[i+3:], nil }), b[i+3:], nil
} }
return invalidReference, nil, NewParserError(b[i:i+1], p.offsetOf(b)+i, "unexpected character 'n' while scanning for a number") return invalidReference, nil, NewParserError(b[i:i+1], "unexpected character 'n' while scanning for a number")
} }
break break
} }
if i == 0 { if i == 0 {
return invalidReference, b, NewParserError(b, p.offsetOf(b), "incomplete number") return invalidReference, b, NewParserError(b, "incomplete number")
} }
kind := Integer kind := Integer
@@ -1242,13 +1252,13 @@ func isValidBinaryRune(r byte) bool {
return r == '0' || r == '1' || r == '_' return r == '0' || r == '1' || r == '_'
} }
func expect(x byte, b []byte, base int) ([]byte, error) { func expect(x byte, b []byte) ([]byte, error) {
if len(b) == 0 { if len(b) == 0 {
return nil, NewParserError(b, base, "expected character %c but the document ended here", x) return nil, NewParserError(b, "expected character %c but the document ended here", x)
} }
if b[0] != x { if b[0] != x {
return nil, NewParserError(b[0:1], base, "expected character %c", x) return nil, NewParserError(b[0:1], "expected character %c", x)
} }
return b[1:], nil return b[1:], nil
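The refactor above drops the explicit offset parameter from every error site: since Highlight is required to be a subslice of the parser input, its position can be recomputed on demand instead of being threaded through every call. A hedged sketch of that design (type and function names hypothetical); here the offset is recovered with cap arithmetic, which works for any subslice of the input, not only tails:

```go
package main

import "fmt"

// parseError carries only a highlight that must share its backing array
// with the input document; no offset field is stored.
type parseError struct {
	highlight []byte
	msg       string
}

func (e *parseError) Error() string { return e.msg }

// offsetOf recovers the highlight's byte offset within input. For any
// subslice s := input[i:j], cap(s) == cap(input) - i, so the offset i is
// the capacity delta regardless of where the subslice ends.
func offsetOf(input, highlight []byte) int {
	return cap(input) - cap(highlight)
}

func main() {
	input := []byte("a = 1\n= 2")
	// Highlight the stray '=' on the second line.
	err := &parseError{highlight: input[6:7], msg: "invalid character at start of key"}
	fmt.Println(offsetOf(input, err.highlight), err) // 6 invalid character at start of key
}
```

Storing the subslice keeps error construction cheap and lets callers that never report positions skip the offset computation entirely.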
-91
View File
@@ -1,7 +1,6 @@
package unstable package unstable
import ( import (
"errors"
"fmt" "fmt"
"strconv" "strconv"
"strings" "strings"
@@ -674,96 +673,6 @@ key3 = "value3"
assert.Equal(t, []string{"key1", "key2", "key3"}, keys) assert.Equal(t, []string{"key1", "key2", "key3"}, keys)
} }
func TestErrorOffsetAfterComment(t *testing.T) {
input := []byte("# comment\n= \"value\"")
p := Parser{}
p.Reset(input)
for p.NextExpression() {
}
err := p.Error()
if err == nil {
t.Fatal("expected an error")
}
var perr *ParserError
if !errors.As(err, &perr) {
t.Fatalf("expected ParserError, got %T", err)
}
if perr.Offset != 10 {
t.Errorf("offset: got %d, want 10", perr.Offset)
}
shape := p.Shape(Range{Offset: uint32(perr.Offset), Length: uint32(len(perr.Highlight))})
if shape.Start.Line != 2 || shape.Start.Column != 1 {
t.Errorf("position: got %d:%d, want 2:1", shape.Start.Line, shape.Start.Column)
}
}
func TestErrorHighlightPositions(t *testing.T) {
examples := []struct {
desc string
input string
wantLine int
wantColumn int
}{
{
desc: "invalid key start after comment",
input: "# comment\n= \"value\"",
wantLine: 2,
wantColumn: 1,
},
{
desc: "invalid key start on first line",
input: "= \"value\"",
wantLine: 1,
wantColumn: 1,
},
{
desc: "invalid key after multiple comments",
input: "# comment 1\n# comment 2\n= \"value\"",
wantLine: 3,
wantColumn: 1,
},
{
desc: "invalid key after valid key-value",
input: "a = 1\n= \"value\"",
wantLine: 2,
wantColumn: 1,
},
{
desc: "invalid key after whitespace on line",
input: "a = 1\n = \"value\"",
wantLine: 2,
wantColumn: 3,
},
}
for _, e := range examples {
t.Run(e.desc, func(t *testing.T) {
p := Parser{}
p.Reset([]byte(e.input))
for p.NextExpression() {
}
err := p.Error()
if err == nil {
t.Fatal("expected an error")
}
var perr *ParserError
if !errors.As(err, &perr) {
t.Fatalf("expected ParserError, got %T", err)
}
shape := p.Shape(Range{Offset: uint32(perr.Offset), Length: uint32(len(perr.Highlight))})
if shape.Start.Line != e.wantLine {
t.Errorf("line: got %d, want %d", shape.Start.Line, e.wantLine)
}
if shape.Start.Column != e.wantColumn {
t.Errorf("column: got %d, want %d", shape.Start.Column, e.wantColumn)
}
})
}
}
func ExampleParser() { func ExampleParser() {
doc := ` doc := `
hello = "world" hello = "world"
+72 -27
View File
@@ -47,31 +47,48 @@ func isUnquotedKeyChar(r byte) bool {
return (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' || r == '_' return (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' || r == '_'
} }
func scanLiteralString(b []byte, base int) ([]byte, []byte, error) { func scanLiteralString(b []byte) ([]byte, []byte, error) {
// literal-string = apostrophe *literal-char apostrophe
// apostrophe = %x27 ; ' apostrophe
// literal-char = %x09 / %x20-26 / %x28-7E / non-ascii
for i := 1; i < len(b); { for i := 1; i < len(b); {
switch b[i] { switch b[i] {
case '\'': case '\'':
return b[:i+1], b[i+1:], nil return b[:i+1], b[i+1:], nil
case '\n', '\r': case '\n', '\r':
return nil, nil, NewParserError(b[i:i+1], base+i, "literal strings cannot have new lines") return nil, nil, NewParserError(b[i:i+1], "literal strings cannot have new lines")
} }
size := characters.Utf8ValidNext(b[i:]) size := characters.Utf8ValidNext(b[i:])
if size == 0 { if size == 0 {
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character") return nil, nil, NewParserError(b[i:i+1], "invalid character")
} }
i += size i += size
} }
return nil, nil, NewParserError(b[len(b):], base+len(b), "unterminated literal string") return nil, nil, NewParserError(b[len(b):], "unterminated literal string")
} }
func scanMultilineLiteralString(b []byte, base int) ([]byte, []byte, error) { func scanMultilineLiteralString(b []byte) ([]byte, []byte, error) {
// ml-literal-string = ml-literal-string-delim [ newline ] ml-literal-body
//                     ml-literal-string-delim
// ml-literal-string-delim = 3apostrophe
// ml-literal-body = *mll-content *( mll-quotes 1*mll-content ) [ mll-quotes ]
//
// mll-content = mll-char / newline
// mll-char = %x09 / %x20-26 / %x28-7E / non-ascii
// mll-quotes = 1*2apostrophe
for i := 3; i < len(b); { for i := 3; i < len(b); {
switch b[i] { switch b[i] {
case '\'': case '\'':
if scanFollowsMultilineLiteralStringDelimiter(b[i:]) { if scanFollowsMultilineLiteralStringDelimiter(b[i:]) {
i += 3 i += 3
// At this point we have found 3 apostrophes, and i is
// the index of the byte after the third one. The
// scanner needs to be eager, because up to 2 extra
// apostrophes can still be accepted at the end of the
// string.
if i >= len(b) || b[i] != '\'' { if i >= len(b) || b[i] != '\'' {
return b[:i], b[i:], nil return b[:i], b[i:], nil
} }
@@ -83,39 +100,39 @@ func scanMultilineLiteralString(b []byte, base int) ([]byte, []byte, error) {
i++ i++
if i < len(b) && b[i] == '\'' { if i < len(b) && b[i] == '\'' {
return nil, nil, NewParserError(b[i-3:i+1], base+i-3, "''' not allowed in multiline literal string") return nil, nil, NewParserError(b[i-3:i+1], "''' not allowed in multiline literal string")
} }
return b[:i], b[i:], nil return b[:i], b[i:], nil
} }
case '\r': case '\r':
if len(b) < i+2 { if len(b) < i+2 {
return nil, nil, NewParserError(b[len(b):], base+len(b), `need a \n after \r`) return nil, nil, NewParserError(b[len(b):], `need a \n after \r`)
} }
if b[i+1] != '\n' { if b[i+1] != '\n' {
return nil, nil, NewParserError(b[i:i+2], base+i, `need a \n after \r`) return nil, nil, NewParserError(b[i:i+2], `need a \n after \r`)
} }
i += 2 i += 2 // skip the \n
continue continue
} }
size := characters.Utf8ValidNext(b[i:]) size := characters.Utf8ValidNext(b[i:])
if size == 0 { if size == 0 {
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character") return nil, nil, NewParserError(b[i:i+1], "invalid character")
} }
i += size i += size
} }
return nil, nil, NewParserError(b[len(b):], base+len(b), `multiline literal string not terminated by '''`) return nil, nil, NewParserError(b[len(b):], `multiline literal string not terminated by '''`)
} }
func scanWindowsNewline(b []byte, base int) ([]byte, []byte, error) { func scanWindowsNewline(b []byte) ([]byte, []byte, error) {
const lenCRLF = 2 const lenCRLF = 2
if len(b) < lenCRLF { if len(b) < lenCRLF {
return nil, nil, NewParserError(b, base, "windows new line expected") return nil, nil, NewParserError(b, "windows new line expected")
} }
if b[1] != '\n' { if b[1] != '\n' {
return nil, nil, NewParserError(b, base, `windows new line should be \r\n`) return nil, nil, NewParserError(b, `windows new line should be \r\n`)
} }
return b[:lenCRLF], b[lenCRLF:], nil return b[:lenCRLF], b[lenCRLF:], nil
@@ -134,7 +151,13 @@ func scanWhitespace(b []byte) ([]byte, []byte) {
return b, b[len(b):] return b, b[len(b):]
} }
func scanComment(b []byte, base int) ([]byte, []byte, error) { func scanComment(b []byte) ([]byte, []byte, error) {
// comment-start-symbol = %x23 ; #
// non-ascii = %x80-D7FF / %xE000-10FFFF
// non-eol = %x09 / %x20-7F / non-ascii
//
// comment = comment-start-symbol *non-eol
for i := 1; i < len(b); { for i := 1; i < len(b); {
if b[i] == '\n' { if b[i] == '\n' {
return b[:i], b[i:], nil return b[:i], b[i:], nil
@@ -143,11 +166,11 @@ func scanComment(b []byte, base int) ([]byte, []byte, error) {
if i+1 < len(b) && b[i+1] == '\n' { if i+1 < len(b) && b[i+1] == '\n' {
return b[:i+1], b[i+1:], nil return b[:i+1], b[i+1:], nil
} }
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character in comment") return nil, nil, NewParserError(b[i:i+1], "invalid character in comment")
} }
size := characters.Utf8ValidNext(b[i:]) size := characters.Utf8ValidNext(b[i:])
if size == 0 { if size == 0 {
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character in comment") return nil, nil, NewParserError(b[i:i+1], "invalid character in comment")
} }
i += size i += size
@@ -156,7 +179,12 @@ func scanComment(b []byte, base int) ([]byte, []byte, error) {
return b, b[len(b):], nil return b, b[len(b):], nil
} }
func scanBasicString(b []byte, base int) ([]byte, bool, []byte, error) { func scanBasicString(b []byte) ([]byte, bool, []byte, error) {
// basic-string = quotation-mark *basic-char quotation-mark
// quotation-mark = %x22 ; "
// basic-char = basic-unescaped / escaped
// basic-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// escaped = escape escape-seq-char
escaped := false escaped := false
i := 1 i := 1
@@ -165,20 +193,31 @@ func scanBasicString(b []byte, base int) ([]byte, bool, []byte, error) {
case '"': case '"':
return b[:i+1], escaped, b[i+1:], nil return b[:i+1], escaped, b[i+1:], nil
case '\n', '\r': case '\n', '\r':
return nil, escaped, nil, NewParserError(b[i:i+1], base+i, "basic strings cannot have new lines") return nil, escaped, nil, NewParserError(b[i:i+1], "basic strings cannot have new lines")
case '\\': case '\\':
if len(b) < i+2 { if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[i:i+1], base+i, "need a character after \\") return nil, escaped, nil, NewParserError(b[i:i+1], "need a character after \\")
} }
escaped = true escaped = true
i++ // skip the next character i++ // skip the next character
} }
} }
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `basic string not terminated by "`) return nil, escaped, nil, NewParserError(b[len(b):], `basic string not terminated by "`)
} }
func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error) { func scanMultilineBasicString(b []byte) ([]byte, bool, []byte, error) {
// ml-basic-string = ml-basic-string-delim [ newline ] ml-basic-body
//                   ml-basic-string-delim
// ml-basic-string-delim = 3quotation-mark
// ml-basic-body = *mlb-content *( mlb-quotes 1*mlb-content ) [ mlb-quotes ]
//
// mlb-content = mlb-char / newline / mlb-escaped-nl
// mlb-char = mlb-unescaped / escaped
// mlb-quotes = 1*2quotation-mark
// mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// mlb-escaped-nl = escape ws newline *( wschar / newline )
escaped := false escaped := false
i := 3 i := 3
@@ -188,6 +227,12 @@ func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error)
if scanFollowsMultilineBasicStringDelimiter(b[i:]) { if scanFollowsMultilineBasicStringDelimiter(b[i:]) {
i += 3 i += 3
// At this point we have found 3 quotation marks, and
// i is the index of the byte after the third one. The
// scanner needs to be eager, because up to 2 extra
// quotation marks can still be accepted at the end of
// the string.
if i >= len(b) || b[i] != '"' { if i >= len(b) || b[i] != '"' {
return b[:i], escaped, b[i:], nil return b[:i], escaped, b[i:], nil
} }
@@ -199,27 +244,27 @@ func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error)
i++ i++
if i < len(b) && b[i] == '"' { if i < len(b) && b[i] == '"' {
return nil, escaped, nil, NewParserError(b[i-3:i+1], base+i-3, `""" not allowed in multiline basic string`) return nil, escaped, nil, NewParserError(b[i-3:i+1], `""" not allowed in multiline basic string`)
} }
return b[:i], escaped, b[i:], nil return b[:i], escaped, b[i:], nil
} }
case '\\': case '\\':
if len(b) < i+2 { if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), "need a character after \\") return nil, escaped, nil, NewParserError(b[len(b):], "need a character after \\")
} }
escaped = true escaped = true
i++ // skip the next character i++ // skip the next character
case '\r': case '\r':
if len(b) < i+2 { if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `need a \n after \r`) return nil, escaped, nil, NewParserError(b[len(b):], `need a \n after \r`)
} }
if b[i+1] != '\n' { if b[i+1] != '\n' {
return nil, escaped, nil, NewParserError(b[i:i+2], base+i, `need a \n after \r`) return nil, escaped, nil, NewParserError(b[i:i+2], `need a \n after \r`)
} }
i++ // skip the \n i++ // skip the \n
} }
} }
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `multiline basic string not terminated by """`) return nil, escaped, nil, NewParserError(b[len(b):], `multiline basic string not terminated by """`)
} }
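The eager delimiter scan that the new comments describe can be isolated: after a run of three closing quotes, up to two more quotes still belong to the string body rather than starting a new delimiter. A simplified sketch under that assumption (function name hypothetical; the real scanner also validates UTF-8, escapes, and CRLF handling):

```go
package main

import "fmt"

// closeMultiline returns the index just past the closing delimiter of a
// multiline string opening at b[0:3] with quote byte q, consuming up to
// two extra quotes eagerly, or -1 if the string is unterminated.
func closeMultiline(b []byte, q byte) int {
	for i := 3; i < len(b); i++ {
		if b[i] == q && i+2 < len(b) && b[i+1] == q && b[i+2] == q {
			i += 3
			// Eagerly accept up to two more quotes: they are part
			// of the string content, not a second delimiter.
			for n := 0; n < 2 && i < len(b) && b[i] == q; n++ {
				i++
			}
			return i
		}
	}
	return -1
}

func main() {
	// "a''" followed by the closing ''' delimiter: all 9 bytes consumed.
	fmt.Println(closeMultiline([]byte(`'''a'''''`), '\'')) // 9
}
```

Without the eager loop, the scanner would stop after the first three quotes and misparse the two trailing apostrophes as the start of new input.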