Compare commits

..

10 Commits

Author SHA1 Message Date
Claude e561c153ef Allow CAPABILITY_UNSAFE_POINTER in baseline
Go 1.26 changes how capslock traces through stdlib internals
(reflect, encoding/json), causing it to report UNSAFE_POINTER for
any package that uses reflection. This is a transitive artifact —
go-toml does not import "unsafe" directly. Include it in the
baseline so the check passes on both Go 1.24 and 1.26, and remove
it from FORBIDDEN_CAPS.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 11:03:56 +00:00
Claude 1ac4431db9 Remove CAPABILITY_UNSAFE_POINTER exclusion hack
Now that capslock is scoped to just the library package (.),
CAPABILITY_UNSAFE_POINTER no longer appears as a false positive.
Add it to FORBIDDEN_CAPS instead, and remove the source-level
unsafe import check and all the grep -v filtering.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 10:53:59 +00:00
Claude cad7681abe Scope capability check to library only, not cmd binaries
Only analyze the go-toml/v2 library package (./), not ./... which
included cmd/ binaries. The library itself only needs REFLECT and
UNANALYZED — FILES and MODIFY_SYSTEM_STATE were from the CLI tools.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 02:28:27 +00:00
Claude 04f7e836d4 Guard against unsafe with source check, not capslock
Capslock reports CAPABILITY_UNSAFE_POINTER as a false positive with
Go 1.26 because it traces through unclassified stdlib reflect
functions (Append, Copy, MakeMap, MakeSlice, New, Zero) into
reflect internals that use unsafe.Pointer. This is not a real
capability of go-toml — it has zero direct unsafe imports.

Instead of using capslock's -capabilities flag (which would hide
real unsafe usage too), filter CAPABILITY_UNSAFE_POINTER from the
comparison and add a direct source check: grep for "unsafe" imports
in go-toml's own .go files. This catches actual unsafe usage while
ignoring the false positive from stdlib.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 02:26:06 +00:00
Claude 58cf71231f Exclude CAPABILITY_UNSAFE_POINTER from capslock analysis
go-toml has no direct unsafe imports. Go 1.26 causes capslock to
report CAPABILITY_UNSAFE_POINTER because it traces through stdlib
internals (reflect -> unsafe). Use -capabilities flag to exclude
it from analysis, and keep it on the forbidden list so any actual
unsafe usage in go-toml code would still be caught at review time.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 02:18:45 +00:00
Claude 25efc11803 Add CAPABILITY_UNSAFE_POINTER to baseline for Go 1.26
Go 1.26 with capslock reports CAPABILITY_UNSAFE_POINTER for most
packages (likely from stdlib unsafe usage in reflect). Add it to
the baseline so CI passes, and remove it from the forbidden list.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 02:15:04 +00:00
Claude 2336b98a36 Use Go 1.26 in CI, check for capability growth not exact match
Rework caps.sh to detect new capabilities rather than requiring an
exact match, so the baseline works across Go versions. Add a
forbidden capabilities list (UNSAFE_POINTER, NETWORK, CGO, EXEC)
that will always fail the check. Use Go 1.26 and capslock@latest
in CI.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 02:05:24 +00:00
Claude e0ceae2490 Fix capabilities CI: pin Go 1.24 and capslock v0.3.1
The baseline was generated with Go 1.24 and capslock v0.3.1. Pin
both in CI to ensure consistent analysis results, since different
Go versions can change which capabilities capslock detects.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 02:01:33 +00:00
Claude 20a7856820 Simplify capability check to track names only, add docs and script
Replace the full JSON baseline with a simple text file listing capability
names per package. Add caps.sh script to generate and check the baseline.
Document in CONTRIBUTING.md and AGENTS.md that PRs increasing capabilities
are unlikely to be accepted.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 01:49:04 +00:00
Claude 478c2ff9d8 Add CI check to enforce capability limits using capslock
Adds a capability baseline file and a GitHub Actions workflow that
uses Google's capslock tool to detect if any new capabilities (file
access, network, syscalls, etc.) are introduced by code changes.

https://claude.ai/code/session_01HwDXpKevFLhE5EfrR6JrBn
2026-03-24 01:40:56 +00:00
20 changed files with 502 additions and 453 deletions
+25
View File
@@ -0,0 +1,25 @@
name: capabilities
on:
push:
branches:
- v2
pull_request:
branches:
- v2
jobs:
check:
name: check capabilities
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0
- name: Setup go
uses: actions/setup-go@v6
with:
go-version: "1.26"
- name: Install capslock
run: go install github.com/google/capslock/cmd/capslock@latest
- name: Check for new capabilities
run: ./caps.sh check
+10 -1
View File
@@ -53,6 +53,14 @@ go-toml is a TOML library for Go. The goal is to provide an easy-to-use and effi
- Commit messages must explain **why** the change is needed
- Keep messages clear and informative even if details are in the PR description
### Capabilities
go-toml tracks system-level capabilities using [capslock](https://github.com/google/capslock). The baseline is in `capability_baseline.txt` and CI enforces that it does not grow.
- **Do not introduce new capabilities.** PRs that increase the capability set (e.g., adding network access, subprocess execution, syscalls) are unlikely to be accepted.
- If a change causes the capabilities check to fail, do not update the baseline to make it pass. Instead, rethink the approach to avoid requiring new capabilities.
- To check locally: `./caps.sh check` (requires `capslock` installed via `go install github.com/google/capslock/cmd/capslock@latest`)
## Pull Request Checklist
Before submitting:
@@ -61,4 +69,5 @@ Before submitting:
2. No backward-incompatible changes (unless discussed)
3. Relevant documentation added/updated
4. No performance regression (verify with benchmarks)
5. Title is clear and understandable for changelog
5. Capabilities are not increasing (`./caps.sh check`)
6. Title is clear and understandable for changelog
+19
View File
@@ -180,6 +180,25 @@ description. Pull requests that lower performance will receive more scrutiny.
[benchstat]: https://pkg.go.dev/golang.org/x/perf/cmd/benchstat
### Capabilities
We use [capslock](https://github.com/google/capslock) to track what
system-level capabilities (file access, network, syscalls, etc.) each package
requires. The current baseline is in `capability_baseline.txt`. CI will fail if
a change introduces a new capability.
**Pull requests that increase the set of capabilities are unlikely to be
accepted.** go-toml is a parsing library and should not need network access,
subprocess execution, or other capabilities beyond what it already uses.
If you believe a new capability is genuinely needed, discuss it in an issue
first. To update the baseline after approval:
```bash
go install github.com/google/capslock/cmd/capslock@latest
./caps.sh generate
```
### Style
Try to look around and follow the same format and structure as the rest of the
+27 -27
View File
@@ -235,17 +235,17 @@ the AST level. See https://pkg.go.dev/github.com/pelletier/go-toml/v2/unstable.
Execution time speedup compared to other Go TOML libraries:
<table>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/HugoFrontMatter-2</td><td>2.1x</td><td>2.0x</td></tr>
<tr><td>Marshal/ReferenceFile/map-2</td><td>2.0x</td><td>2.0x</td></tr>
<tr><td>Marshal/ReferenceFile/struct-2</td><td>2.3x</td><td>2.5x</td></tr>
<tr><td>Unmarshal/HugoFrontMatter-2</td><td>3.3x</td><td>2.8x</td></tr>
<tr><td>Unmarshal/ReferenceFile/map-2</td><td>2.9x</td><td>3.0x</td></tr>
<tr><td>Unmarshal/ReferenceFile/struct-2</td><td>4.8x</td><td>5.0x</td></tr>
</tbody>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/HugoFrontMatter-2</td><td>1.9x</td><td>2.2x</td></tr>
<tr><td>Marshal/ReferenceFile/map-2</td><td>1.7x</td><td>2.1x</td></tr>
<tr><td>Marshal/ReferenceFile/struct-2</td><td>2.2x</td><td>3.0x</td></tr>
<tr><td>Unmarshal/HugoFrontMatter-2</td><td>2.9x</td><td>2.7x</td></tr>
<tr><td>Unmarshal/ReferenceFile/map-2</td><td>2.6x</td><td>2.7x</td></tr>
<tr><td>Unmarshal/ReferenceFile/struct-2</td><td>4.6x</td><td>5.1x</td></tr>
</tbody>
</table>
<details><summary>See more</summary>
<p>The table above has the results of the most common use-cases. The table below
@@ -253,22 +253,22 @@ contains the results of all benchmarks, including unrealistic ones. It is
provided for completeness.</p>
<table>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/SimpleDocument/map-2</td><td>2.0x</td><td>2.9x</td></tr>
<tr><td>Marshal/SimpleDocument/struct-2</td><td>2.5x</td><td>3.6x</td></tr>
<tr><td>Unmarshal/SimpleDocument/map-2</td><td>4.2x</td><td>3.4x</td></tr>
<tr><td>Unmarshal/SimpleDocument/struct-2</td><td>5.9x</td><td>4.4x</td></tr>
<tr><td>UnmarshalDataset/example-2</td><td>3.2x</td><td>2.9x</td></tr>
<tr><td>UnmarshalDataset/code-2</td><td>2.4x</td><td>2.8x</td></tr>
<tr><td>UnmarshalDataset/twitter-2</td><td>2.7x</td><td>2.5x</td></tr>
<tr><td>UnmarshalDataset/citm_catalog-2</td><td>2.3x</td><td>2.3x</td></tr>
<tr><td>UnmarshalDataset/canada-2</td><td>1.9x</td><td>1.5x</td></tr>
<tr><td>UnmarshalDataset/config-2</td><td>5.4x</td><td>3.0x</td></tr>
<tr><td>geomean</td><td>2.9x</td><td>2.8x</td></tr>
</tbody>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/SimpleDocument/map-2</td><td>1.8x</td><td>2.7x</td></tr>
<tr><td>Marshal/SimpleDocument/struct-2</td><td>2.7x</td><td>3.8x</td></tr>
<tr><td>Unmarshal/SimpleDocument/map-2</td><td>3.8x</td><td>3.0x</td></tr>
<tr><td>Unmarshal/SimpleDocument/struct-2</td><td>5.6x</td><td>4.1x</td></tr>
<tr><td>UnmarshalDataset/example-2</td><td>3.0x</td><td>3.2x</td></tr>
<tr><td>UnmarshalDataset/code-2</td><td>2.3x</td><td>2.9x</td></tr>
<tr><td>UnmarshalDataset/twitter-2</td><td>2.6x</td><td>2.7x</td></tr>
<tr><td>UnmarshalDataset/citm_catalog-2</td><td>2.2x</td><td>2.3x</td></tr>
<tr><td>UnmarshalDataset/canada-2</td><td>1.8x</td><td>1.5x</td></tr>
<tr><td>UnmarshalDataset/config-2</td><td>4.1x</td><td>2.9x</td></tr>
<tr><td>geomean</td><td>2.7x</td><td>2.8x</td></tr>
</tbody>
</table>
<p>This table can be generated with <code>./ci.sh benchmark -a -html</code>.</p>
</details>
+1
View File
@@ -0,0 +1 @@
github.com/pelletier/go-toml/v2: CAPABILITY_REFLECT, CAPABILITY_UNANALYZED, CAPABILITY_UNSAFE_POINTER
Executable
+101
View File
@@ -0,0 +1,101 @@
#!/usr/bin/env bash
#
# Generates or checks the capability baseline for go-toml.
#
# Usage:
# ./caps.sh generate # regenerate capability_baseline.txt
# ./caps.sh check # check that capabilities haven't grown
#
# Requires: go, capslock (go install github.com/google/capslock/cmd/capslock@latest)
set -euo pipefail
BASELINE="capability_baseline.txt"
CAPSLOCK="${CAPSLOCK:-capslock}"
# Capabilities that must never appear in any package.
FORBIDDEN_CAPS=(
CAPABILITY_NETWORK
CAPABILITY_CGO
CAPABILITY_EXEC
)
capslock_to_baseline() {
"$CAPSLOCK" -packages=. -output=package -granularity=package \
| jq -r 'to_entries | sort_by(.key) | .[] | .key + ": " + (.value | sort | join(", "))'
}
generate() {
capslock_to_baseline > "$BASELINE"
echo "Wrote $BASELINE"
}
check() {
if [ ! -f "$BASELINE" ]; then
echo "ERROR: $BASELINE not found. Run '$0 generate' first."
exit 1
fi
current=$(mktemp)
trap 'rm -f "$current"' EXIT
capslock_to_baseline > "$current"
failed=0
# Check for forbidden capabilities in current output.
for cap in "${FORBIDDEN_CAPS[@]}"; do
if grep -q "$cap" "$current"; then
echo "FORBIDDEN capability found: $cap"
grep "$cap" "$current"
failed=1
fi
done
# Extract all unique capability names from baseline and current.
baseline_caps=$(grep -oE 'CAPABILITY_[A-Z_]+' "$BASELINE" | sort -u)
current_caps=$(grep -oE 'CAPABILITY_[A-Z_]+' "$current" | sort -u)
# Check for new capability names not in the baseline.
new_caps=$(comm -13 <(echo "$baseline_caps") <(echo "$current_caps"))
if [ -n "$new_caps" ]; then
echo "NEW capabilities detected (not in baseline):"
echo "$new_caps"
failed=1
fi
# Check for new per-package capabilities (a package gained a capability it didn't have before).
while IFS=': ' read -r pkg caps; do
baseline_pkg_caps=$(grep "^${pkg}:" "$BASELINE" 2>/dev/null | sed 's/^[^:]*: //' || true)
if [ -z "$baseline_pkg_caps" ]; then
echo "NEW package with capabilities: $pkg: $caps"
failed=1
continue
fi
# Check each capability in current for this package
for cap in $(echo "$caps" | tr ', ' '\n' | grep -v '^$'); do
if ! echo "$baseline_pkg_caps" | grep -q "$cap"; then
echo "NEW capability for $pkg: $cap"
failed=1
fi
done
done < "$current"
if [ "$failed" -eq 1 ]; then
echo ""
echo "FAILED: capabilities have grown."
echo "If this is intentional, run '$0 generate' and commit the updated $BASELINE."
exit 1
fi
echo "OK: no new capabilities detected."
}
case "${1:-}" in
generate) generate ;;
check) check ;;
*)
echo "Usage: $0 {generate|check}"
exit 1
;;
esac
+1 -6
View File
@@ -147,7 +147,7 @@ bench() {
pushd "$dir"
if [ "${replace}" != "" ]; then
find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2\"|${replace}\"|g" {} \;
find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2|${replace}|g" {} \;
go get "${replace}"
fi
@@ -195,11 +195,6 @@ for line in reversed(lines[2:]):
"%.1fx" % (float(line[3])/v2), # v1
"%.1fx" % (float(line[7])/v2), # bs
])
if not results:
print("No benchmark results to display.", file=sys.stderr)
sys.exit(1)
# move geomean to the end
results.append(results[0])
del results[0]
+93 -78
View File
@@ -9,60 +9,64 @@ import (
"github.com/pelletier/go-toml/v2/unstable"
)
func parseInteger(b []byte, base int) (int64, error) {
func parseInteger(b []byte) (int64, error) {
if len(b) > 2 && b[0] == '0' {
switch b[1] {
case 'x':
return parseIntHex(b, base)
return parseIntHex(b)
case 'b':
return parseIntBin(b, base)
return parseIntBin(b)
case 'o':
return parseIntOct(b, base)
return parseIntOct(b)
default:
panic(fmt.Errorf("invalid base '%c', should have been checked by scanIntOrFloat", b[1]))
}
}
return parseIntDec(b, base)
return parseIntDec(b)
}
func parseLocalDate(b []byte, base int) (LocalDate, error) {
func parseLocalDate(b []byte) (LocalDate, error) {
// full-date = date-fullyear "-" date-month "-" date-mday
// date-fullyear = 4DIGIT
// date-month = 2DIGIT ; 01-12
// date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on month/year
var date LocalDate
if len(b) != 10 || b[4] != '-' || b[7] != '-' {
return date, unstable.NewParserError(b, base, "dates are expected to have the format YYYY-MM-DD")
return date, unstable.NewParserError(b, "dates are expected to have the format YYYY-MM-DD")
}
var err error
date.Year, err = parseDecimalDigits(b[0:4], base)
date.Year, err = parseDecimalDigits(b[0:4])
if err != nil {
return LocalDate{}, err
}
date.Month, err = parseDecimalDigits(b[5:7], base+5)
date.Month, err = parseDecimalDigits(b[5:7])
if err != nil {
return LocalDate{}, err
}
date.Day, err = parseDecimalDigits(b[8:10], base+8)
date.Day, err = parseDecimalDigits(b[8:10])
if err != nil {
return LocalDate{}, err
}
if !isValidDate(date.Year, date.Month, date.Day) {
return LocalDate{}, unstable.NewParserError(b, base, "impossible date")
return LocalDate{}, unstable.NewParserError(b, "impossible date")
}
return date, nil
}
func parseDecimalDigits(b []byte, base int) (int, error) {
func parseDecimalDigits(b []byte) (int, error) {
v := 0
for i, c := range b {
if c < '0' || c > '9' {
return 0, unstable.NewParserError(b[i:i+1], base+i, "expected digit (0-9)")
return 0, unstable.NewParserError(b[i:i+1], "expected digit (0-9)")
}
v *= 10
v += int(c - '0')
@@ -71,18 +75,21 @@ func parseDecimalDigits(b []byte, base int) (int, error) {
return v, nil
}
func parseDateTime(b []byte, base int) (time.Time, error) {
origLen := len(b)
dt, b, err := parseLocalDateTime(b, base)
func parseDateTime(b []byte) (time.Time, error) {
// offset-date-time = full-date time-delim full-time
// full-time = partial-time time-offset
// time-offset = "Z" / time-numoffset
// time-numoffset = ( "+" / "-" ) time-hour ":" time-minute
dt, b, err := parseLocalDateTime(b)
if err != nil {
return time.Time{}, err
}
tzBase := base + origLen - len(b)
var zone *time.Location
if len(b) == 0 {
// parser should have checked that when assigning the date time node
panic("date time should have a timezone")
}
@@ -92,7 +99,7 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
} else {
const dateTimeByteLen = 6
if len(b) != dateTimeByteLen {
return time.Time{}, unstable.NewParserError(b, tzBase, "invalid date-time timezone")
return time.Time{}, unstable.NewParserError(b, "invalid date-time timezone")
}
var direction int
switch b[0] {
@@ -101,27 +108,27 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
case '+':
direction = +1
default:
return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset character")
return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset character")
}
if b[3] != ':' {
return time.Time{}, unstable.NewParserError(b[3:4], tzBase+3, "expected a : separator")
return time.Time{}, unstable.NewParserError(b[3:4], "expected a : separator")
}
hours, err := parseDecimalDigits(b[1:3], tzBase+1)
hours, err := parseDecimalDigits(b[1:3])
if err != nil {
return time.Time{}, err
}
if hours > 23 {
return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset hours")
return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset hours")
}
minutes, err := parseDecimalDigits(b[4:6], tzBase+4)
minutes, err := parseDecimalDigits(b[4:6])
if err != nil {
return time.Time{}, err
}
if minutes > 59 {
return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset minutes")
return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset minutes")
}
seconds := direction * (hours*3600 + minutes*60)
@@ -134,7 +141,7 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
}
if len(b) > 0 {
return time.Time{}, unstable.NewParserError(b, tzBase, "extra bytes at the end of the timezone")
return time.Time{}, unstable.NewParserError(b, "extra bytes at the end of the timezone")
}
t := time.Date(
@@ -150,15 +157,15 @@ func parseDateTime(b []byte, base int) (time.Time, error) {
return t, nil
}
func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
func parseLocalDateTime(b []byte) (LocalDateTime, []byte, error) {
var dt LocalDateTime
const localDateTimeByteMinLen = 11
if len(b) < localDateTimeByteMinLen {
return dt, nil, unstable.NewParserError(b, base, "local datetimes are expected to have the format YYYY-MM-DDTHH:MM:SS[.NNNNNNNNN]")
return dt, nil, unstable.NewParserError(b, "local datetimes are expected to have the format YYYY-MM-DDTHH:MM:SS[.NNNNNNNNN]")
}
date, err := parseLocalDate(b[:10], base)
date, err := parseLocalDate(b[:10])
if err != nil {
return dt, nil, err
}
@@ -166,10 +173,10 @@ func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
sep := b[10]
if sep != 'T' && sep != ' ' && sep != 't' {
return dt, nil, unstable.NewParserError(b[10:11], base+10, "datetime separator is expected to be T or a space")
return dt, nil, unstable.NewParserError(b[10:11], "datetime separator is expected to be T or a space")
}
t, rest, err := parseLocalTime(b[11:], base+11)
t, rest, err := parseLocalTime(b[11:])
if err != nil {
return dt, nil, err
}
@@ -181,53 +188,53 @@ func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
// parseLocalTime is a bit different because it also returns the remaining
// []byte that is didn't need. This is to allow parseDateTime to parse those
// remaining bytes as a timezone.
func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
func parseLocalTime(b []byte) (LocalTime, []byte, error) {
var (
nspow = [10]int{0, 1e8, 1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1, 1e0}
t LocalTime
)
// check if b matches to have expected format HH:MM:SS[.NNNNNN]
const localTimeByteLen = 8
if len(b) < localTimeByteLen {
return t, nil, unstable.NewParserError(b, base, "times are expected to have the format HH:MM:SS[.NNNNNN]")
return t, nil, unstable.NewParserError(b, "times are expected to have the format HH:MM:SS[.NNNNNN]")
}
var err error
t.Hour, err = parseDecimalDigits(b[0:2], base)
t.Hour, err = parseDecimalDigits(b[0:2])
if err != nil {
return t, nil, err
}
if t.Hour > 23 {
return t, nil, unstable.NewParserError(b[0:2], base, "hour cannot be greater 23")
return t, nil, unstable.NewParserError(b[0:2], "hour cannot be greater 23")
}
if b[2] != ':' {
return t, nil, unstable.NewParserError(b[2:3], base+2, "expecting colon between hours and minutes")
return t, nil, unstable.NewParserError(b[2:3], "expecting colon between hours and minutes")
}
t.Minute, err = parseDecimalDigits(b[3:5], base+3)
t.Minute, err = parseDecimalDigits(b[3:5])
if err != nil {
return t, nil, err
}
if t.Minute > 59 {
return t, nil, unstable.NewParserError(b[3:5], base+3, "minutes cannot be greater 59")
return t, nil, unstable.NewParserError(b[3:5], "minutes cannot be greater 59")
}
if b[5] != ':' {
return t, nil, unstable.NewParserError(b[5:6], base+5, "expecting colon between minutes and seconds")
return t, nil, unstable.NewParserError(b[5:6], "expecting colon between minutes and seconds")
}
t.Second, err = parseDecimalDigits(b[6:8], base+6)
t.Second, err = parseDecimalDigits(b[6:8])
if err != nil {
return t, nil, err
}
if t.Second > 59 {
return t, nil, unstable.NewParserError(b[6:8], base+6, "seconds cannot be greater than 59")
return t, nil, unstable.NewParserError(b[6:8], "seconds cannot be greater than 59")
}
b = b[8:]
base += 8
if len(b) >= 1 && b[0] == '.' {
frac := 0
@@ -237,7 +244,7 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
for i, c := range b[1:] {
if !isDigit(c) {
if i == 0 {
return t, nil, unstable.NewParserError(b[0:1], base, "need at least one digit after fraction point")
return t, nil, unstable.NewParserError(b[0:1], "need at least one digit after fraction point")
}
break
}
@@ -245,6 +252,13 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
const maxFracPrecision = 9
if i >= maxFracPrecision {
// go-toml allows decoding fractional seconds
// beyond the supported precision of 9
// digits. It truncates the fractional component
// to the supported precision and ignores the
// remaining digits.
//
// https://github.com/pelletier/go-toml/discussions/707
continue
}
@@ -254,7 +268,7 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
}
if precision == 0 {
return t, nil, unstable.NewParserError(b[:1], base, "nanoseconds need at least one digit")
return t, nil, unstable.NewParserError(b[:1], "nanoseconds need at least one digit")
}
t.Nanosecond = frac * nspow[precision]
@@ -265,35 +279,35 @@ func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
return t, b, nil
}
func parseFloat(b []byte, base int) (float64, error) {
func parseFloat(b []byte) (float64, error) {
if len(b) == 4 && (b[0] == '+' || b[0] == '-') && b[1] == 'n' && b[2] == 'a' && b[3] == 'n' {
return math.NaN(), nil
}
cleaned, err := checkAndRemoveUnderscoresFloats(b, base)
cleaned, err := checkAndRemoveUnderscoresFloats(b)
if err != nil {
return 0, err
}
if cleaned[0] == '.' {
return 0, unstable.NewParserError(b, base, "float cannot start with a dot")
return 0, unstable.NewParserError(b, "float cannot start with a dot")
}
if cleaned[len(cleaned)-1] == '.' {
return 0, unstable.NewParserError(b, base, "float cannot end with a dot")
return 0, unstable.NewParserError(b, "float cannot end with a dot")
}
dotAlreadySeen := false
for i, c := range cleaned {
if c == '.' {
if dotAlreadySeen {
return 0, unstable.NewParserError(b[i:i+1], base+i, "float can have at most one decimal point")
return 0, unstable.NewParserError(b[i:i+1], "float can have at most one decimal point")
}
if !isDigit(cleaned[i-1]) {
return 0, unstable.NewParserError(b[i-1:i+1], base+i-1, "float decimal point must be preceded by a digit")
return 0, unstable.NewParserError(b[i-1:i+1], "float decimal point must be preceded by a digit")
}
if !isDigit(cleaned[i+1]) {
return 0, unstable.NewParserError(b[i:i+2], base+i, "float decimal point must be followed by a digit")
return 0, unstable.NewParserError(b[i:i+2], "float decimal point must be followed by a digit")
}
dotAlreadySeen = true
}
@@ -304,54 +318,54 @@ func parseFloat(b []byte, base int) (float64, error) {
start = 1
}
if cleaned[start] == '0' && len(cleaned) > start+1 && isDigit(cleaned[start+1]) {
return 0, unstable.NewParserError(b, base, "float integer part cannot have leading zeroes")
return 0, unstable.NewParserError(b, "float integer part cannot have leading zeroes")
}
f, err := strconv.ParseFloat(string(cleaned), 64)
if err != nil {
return 0, unstable.NewParserError(b, base, "unable to parse float: %w", err)
return 0, unstable.NewParserError(b, "unable to parse float: %w", err)
}
return f, nil
}
func parseIntHex(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
func parseIntHex(b []byte) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:])
if err != nil {
return 0, err
}
i, err := strconv.ParseInt(string(cleaned), 16, 64)
if err != nil {
return 0, unstable.NewParserError(b, base, "couldn't parse hexadecimal number: %w", err)
return 0, unstable.NewParserError(b, "couldn't parse hexadecimal number: %w", err)
}
return i, nil
}
func parseIntOct(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
func parseIntOct(b []byte) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:])
if err != nil {
return 0, err
}
i, err := strconv.ParseInt(string(cleaned), 8, 64)
if err != nil {
return 0, unstable.NewParserError(b, base, "couldn't parse octal number: %w", err)
return 0, unstable.NewParserError(b, "couldn't parse octal number: %w", err)
}
return i, nil
}
func parseIntBin(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
func parseIntBin(b []byte) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:])
if err != nil {
return 0, err
}
i, err := strconv.ParseInt(string(cleaned), 2, 64)
if err != nil {
return 0, unstable.NewParserError(b, base, "couldn't parse binary number: %w", err)
return 0, unstable.NewParserError(b, "couldn't parse binary number: %w", err)
}
return i, nil
@@ -361,8 +375,8 @@ func isSign(b byte) bool {
return b == '+' || b == '-'
}
func parseIntDec(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b, base)
func parseIntDec(b []byte) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b)
if err != nil {
return 0, err
}
@@ -374,18 +388,18 @@ func parseIntDec(b []byte, base int) (int64, error) {
}
if len(cleaned) > startIdx+1 && cleaned[startIdx] == '0' {
return 0, unstable.NewParserError(b, base, "leading zero not allowed on decimal number")
return 0, unstable.NewParserError(b, "leading zero not allowed on decimal number")
}
i, err := strconv.ParseInt(string(cleaned), 10, 64)
if err != nil {
return 0, unstable.NewParserError(b, base, "couldn't parse decimal number: %w", err)
return 0, unstable.NewParserError(b, "couldn't parse decimal number: %w", err)
}
return i, nil
}
func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
func checkAndRemoveUnderscoresIntegers(b []byte) ([]byte, error) {
start := 0
if b[start] == '+' || b[start] == '-' {
start++
@@ -396,11 +410,11 @@ func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
}
if b[start] == '_' {
return nil, unstable.NewParserError(b[start:start+1], base+start, "number cannot start with underscore")
return nil, unstable.NewParserError(b[start:start+1], "number cannot start with underscore")
}
if b[len(b)-1] == '_' {
return nil, unstable.NewParserError(b[len(b)-1:], base+len(b)-1, "number cannot end with underscore")
return nil, unstable.NewParserError(b[len(b)-1:], "number cannot end with underscore")
}
// fast path
@@ -422,7 +436,7 @@ func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
c := b[i]
if c == '_' {
if !before {
return nil, unstable.NewParserError(b[i-1:i+1], base+i-1, "number must have at least one digit between underscores")
return nil, unstable.NewParserError(b[i-1:i+1], "number must have at least one digit between underscores")
}
before = false
} else {
@@ -434,13 +448,13 @@ func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
return cleaned, nil
}
func checkAndRemoveUnderscoresFloats(b []byte, base int) ([]byte, error) {
func checkAndRemoveUnderscoresFloats(b []byte) ([]byte, error) {
if b[0] == '_' {
return nil, unstable.NewParserError(b[0:1], base, "number cannot start with underscore")
return nil, unstable.NewParserError(b[0:1], "number cannot start with underscore")
}
if b[len(b)-1] == '_' {
return nil, unstable.NewParserError(b[len(b)-1:], base+len(b)-1, "number cannot end with underscore")
return nil, unstable.NewParserError(b[len(b)-1:], "number cannot end with underscore")
}
// fast path
@@ -463,26 +477,27 @@ func checkAndRemoveUnderscoresFloats(b []byte, base int) ([]byte, error) {
switch c {
case '_':
if !before {
return nil, unstable.NewParserError(b[i-1:i+1], base+i-1, "number must have at least one digit between underscores")
return nil, unstable.NewParserError(b[i-1:i+1], "number must have at least one digit between underscores")
}
if i < len(b)-1 && (b[i+1] == 'e' || b[i+1] == 'E') {
return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore before exponent")
return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore before exponent")
}
before = false
case '+', '-':
// signed exponents
cleaned = append(cleaned, c)
before = false
case 'e', 'E':
if i < len(b)-1 && b[i+1] == '_' {
return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore after exponent")
return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore after exponent")
}
cleaned = append(cleaned, c)
case '.':
if i < len(b)-1 && b[i+1] == '_' {
return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore after decimal point")
return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore after decimal point")
}
if i > 0 && b[i-1] == '_' {
return nil, unstable.NewParserError(b[i-1:i], base+i-1, "cannot have underscore before decimal point")
return nil, unstable.NewParserError(b[i-1:i], "cannot have underscore before decimal point")
}
cleaned = append(cleaned, c)
default:
+21 -1
View File
@@ -2,6 +2,7 @@ package toml
import (
"fmt"
"reflect"
"strconv"
"strings"
@@ -99,7 +100,7 @@ func (e *DecodeError) Key() Key {
//
//nolint:funlen
func wrapDecodeError(document []byte, de *unstable.ParserError) *DecodeError {
offset := de.Offset
offset := subsliceOffset(document, de.Highlight)
errMessage := de.Error()
errLine, errColumn := positionAtEnd(document[:offset])
@@ -261,3 +262,22 @@ func positionAtEnd(b []byte) (row int, column int) {
return row, column
}
// subsliceOffset returns the byte offset of subslice within data.
// subslice must share the same backing array as data.
func subsliceOffset(data []byte, subslice []byte) int {
if len(subslice) == 0 {
return 0
}
// Use reflect to get the data pointers of both slices.
// This is safe because we're only reading the pointer values for comparison.
dataPtr := reflect.ValueOf(data).Pointer()
subPtr := reflect.ValueOf(subslice).Pointer()
offset := int(subPtr - dataPtr)
if offset < 0 || offset > len(data) {
panic("subslice is not within data")
}
return offset
}
-95
View File
@@ -171,7 +171,6 @@ line 5`,
err := wrapDecodeError(doc, &unstable.ParserError{
Highlight: hl,
Offset: start,
Message: e.msg,
})
@@ -287,100 +286,6 @@ func TestDecodeError_Position(t *testing.T) {
}
}
func TestDecodeError_PositionAfterComments(t *testing.T) {
examples := []struct {
name string
doc string
expectedRow int
expectedCol int
errContains string
}{
{
name: "invalid key after comment",
doc: "# comment\n= \"value\"",
expectedRow: 2,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key after multiple comments",
doc: "# line 1\n# line 2\n= \"value\"",
expectedRow: 3,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key after valid assignment and comment",
doc: "a = 1\n# comment\n= \"value\"",
expectedRow: 3,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key on first line",
doc: "= \"value\"",
expectedRow: 1,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key with leading whitespace",
doc: "# comment\n = \"value\"",
expectedRow: 2,
expectedCol: 3,
errContains: "invalid character at start of key",
},
}
for _, e := range examples {
t.Run(e.name, func(t *testing.T) {
var v map[string]interface{}
err := Unmarshal([]byte(e.doc), &v)
if err == nil {
t.Fatal("expected an error")
}
var derr *DecodeError
if !errors.As(err, &derr) {
t.Fatalf("expected DecodeError, got %T: %v", err, err)
}
row, col := derr.Position()
if row != e.expectedRow {
t.Errorf("row: got %d, want %d (error: %s)", row, e.expectedRow, derr.String())
}
if col != e.expectedCol {
t.Errorf("col: got %d, want %d (error: %s)", col, e.expectedCol, derr.String())
}
if !strings.Contains(derr.Error(), e.errContains) {
t.Errorf("error %q does not contain %q", derr.Error(), e.errContains)
}
})
}
}
func TestDecodeError_HumanStringAfterComments(t *testing.T) {
doc := "# comment\n= \"value\""
var v map[string]interface{}
err := Unmarshal([]byte(doc), &v)
if err == nil {
t.Fatal("expected an error")
}
var derr *DecodeError
if !errors.As(err, &derr) {
t.Fatalf("expected DecodeError, got %T: %v", err, err)
}
human := derr.String()
if !strings.Contains(human, "= \"value\"") {
t.Errorf("human-readable error should show the offending line, got:\n%s", human)
}
if !strings.Contains(human, "2|") {
t.Errorf("human-readable error should reference line 2, got:\n%s", human)
}
}
func TestStrictErrorUnwrap(t *testing.T) {
fo := bytes.NewBufferString(`
Missing = 1
+16 -12
View File
@@ -24,57 +24,61 @@ import (
// 0x9 => tab, ok
// 0xA - 0x1F => invalid
// 0x7F => invalid
func Utf8TomlValidAlreadyEscaped(p []byte) int {
consumed := 0
func Utf8TomlValidAlreadyEscaped(p []byte) []byte {
// Fast path. Check for and skip 8 bytes of ASCII characters per iteration.
for len(p) >= 8 {
// Combining two 32 bit loads allows the same code to be used
// for 32 and 64 bit platforms.
// The compiler can generate a 32bit load for first32 and second32
// on many platforms. See test/codegen/memcombine.go.
first32 := uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16 | uint32(p[3])<<24
second32 := uint32(p[4]) | uint32(p[5])<<8 | uint32(p[6])<<16 | uint32(p[7])<<24
if (first32|second32)&0x80808080 != 0 {
// Found a non ASCII byte (>= RuneSelf).
break
}
for i, b := range p[:8] {
if InvalidASCII(b) {
return consumed + i
return p[i : i+1]
}
}
p = p[8:]
consumed += 8
}
n := len(p)
for i := 0; i < n; {
pi := p[i]
if pi < utf8.RuneSelf {
if InvalidASCII(pi) {
return consumed + i
return p[i : i+1]
}
i++
continue
}
x := first[pi]
if x == xx {
return consumed + i
// Illegal starter byte.
return p[i : i+1]
}
size := int(x & 7)
if i+size > n {
return consumed + i
// Short or invalid.
return p[i:n]
}
accept := acceptRanges[x>>4]
if c := p[i+1]; c < accept.lo || accept.hi < c {
return consumed + i
return p[i : i+2]
} else if size == 2 { //revive:disable:empty-block
} else if c := p[i+2]; c < locb || hicb < c {
return consumed + i
return p[i : i+3]
} else if size == 3 { //revive:disable:empty-block
} else if c := p[i+3]; c < locb || hicb < c {
return consumed + i
return p[i : i+4]
}
i += size
}
return -1
return nil
}
// Utf8ValidNext returns the size of the next rune if valid, 0 otherwise.
+5 -5
View File
@@ -32,7 +32,7 @@ func (d LocalDate) MarshalText() ([]byte, error) {
// UnmarshalText parses b using RFC 3339 to fill d.
func (d *LocalDate) UnmarshalText(b []byte) error {
res, err := parseLocalDate(b, 0)
res, err := parseLocalDate(b)
if err != nil {
return err
}
@@ -75,9 +75,9 @@ func (d LocalTime) MarshalText() ([]byte, error) {
// UnmarshalText parses b using RFC 3339 to fill d.
func (d *LocalTime) UnmarshalText(b []byte) error {
res, left, err := parseLocalTime(b, 0)
res, left, err := parseLocalTime(b)
if err == nil && len(left) != 0 {
err = unstable.NewParserError(left, len(b)-len(left), "extra characters")
err = unstable.NewParserError(left, "extra characters")
}
if err != nil {
return err
@@ -109,9 +109,9 @@ func (d LocalDateTime) MarshalText() ([]byte, error) {
// UnmarshalText parses b using RFC 3339 to fill d.
func (d *LocalDateTime) UnmarshalText(data []byte) error {
res, left, err := parseLocalDateTime(data, 0)
res, left, err := parseLocalDateTime(data)
if err == nil && len(left) != 0 {
err = unstable.NewParserError(left, len(data)-len(left), "extra characters")
err = unstable.NewParserError(left, "extra characters")
}
if err != nil {
return err
+9 -12
View File
@@ -704,18 +704,15 @@ func (enc *Encoder) encodeMap(b []byte, ctx encoderCtx, v reflect.Value) ([]byte
for iter.Next() {
v := iter.Value()
// Handle nil values: convert nil pointers to zero value,
// skip nil interfaces and nil maps.
switch v.Kind() {
case reflect.Ptr:
if v.IsNil() {
if isNil(v) {
// For nil pointers, convert to zero value of the element type.
// This allows round-trip marshaling of maps with nil pointer values.
// For nil interfaces and nil maps, skip since we can't derive a type.
if v.Kind() == reflect.Ptr {
v = reflect.Zero(v.Type().Elem())
}
case reflect.Interface, reflect.Map:
if v.IsNil() {
} else {
continue
}
default:
}
k, err := enc.keyToString(iter.Key())
@@ -939,7 +936,7 @@ func (enc *Encoder) encodeTable(b []byte, ctx encoderCtx, t table) ([]byte, erro
if shouldOmitEmpty(kv.Options, kv.Value) {
continue
}
if kv.Options.omitzero && shouldOmitZero(kv.Options, kv.Value) {
if shouldOmitZero(kv.Options, kv.Value) {
continue
}
hasNonEmptyKV = true
@@ -961,7 +958,7 @@ func (enc *Encoder) encodeTable(b []byte, ctx encoderCtx, t table) ([]byte, erro
if shouldOmitEmpty(table.Options, table.Value) {
continue
}
if table.Options.omitzero && shouldOmitZero(table.Options, table.Value) {
if shouldOmitZero(table.Options, table.Value) {
continue
}
if first {
@@ -998,7 +995,7 @@ func (enc *Encoder) encodeTableInline(b []byte, ctx encoderCtx, t table) ([]byte
if shouldOmitEmpty(kv.Options, kv.Value) {
continue
}
if kv.Options.omitzero && shouldOmitZero(kv.Options, kv.Value) {
if shouldOmitZero(kv.Options, kv.Value) {
continue
}
+6 -8
View File
@@ -54,12 +54,10 @@ func (s *strict) MissingTable(node *unstable.Node) {
return
}
loc, offset := s.keyLocation(node)
s.missing = append(s.missing, unstable.ParserError{
Highlight: loc,
Highlight: s.keyLocation(node),
Message: "missing table",
Key: s.key.Key(),
Offset: offset,
})
}
@@ -68,12 +66,10 @@ func (s *strict) MissingField(node *unstable.Node) {
return
}
loc, offset := s.keyLocation(node)
s.missing = append(s.missing, unstable.ParserError{
Highlight: loc,
Highlight: s.keyLocation(node),
Message: "missing field",
Key: s.key.Key(),
Offset: offset,
})
}
@@ -94,7 +90,7 @@ func (s *strict) Error(doc []byte) error {
return err
}
func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
func (s *strict) keyLocation(node *unstable.Node) []byte {
k := node.Key()
hasOne := k.Next()
@@ -102,6 +98,7 @@ func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
panic("should not be called with empty key")
}
// Get the range from the first key to the last key.
firstRaw := k.Node().Raw
lastRaw := firstRaw
@@ -109,8 +106,9 @@ func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
lastRaw = k.Node().Raw
}
// Compute the slice from the document using the ranges.
start := firstRaw.Offset
end := lastRaw.Offset + lastRaw.Length
return s.doc[start:end], int(start)
return s.doc[start:end]
}
+20 -20
View File
@@ -625,7 +625,7 @@ func (d *decoder) handleTable(key unstable.Iterator, v reflect.Value) (reflect.V
}
}
}
return reflect.Value{}, unstable.NewParserError(key.Node().Data, int(key.Node().Raw.Offset), "cannot store a table in a slice")
return reflect.Value{}, unstable.NewParserError(key.Node().Data, "cannot store a table in a slice")
}
if key.Next() {
// Still scoping the key
@@ -748,7 +748,7 @@ func (d *decoder) tryTextUnmarshaler(node *unstable.Node, v reflect.Value) (bool
if v.CanAddr() && v.Addr().Type().Implements(textUnmarshalerType) {
err := v.Addr().Interface().(encoding.TextUnmarshaler).UnmarshalText(node.Data)
if err != nil {
return false, unstable.NewParserError(d.p.Raw(node.Raw), int(node.Raw.Offset), "%w", err)
return false, unstable.NewParserError(d.p.Raw(node.Raw), "%w", err)
}
return true, nil
@@ -896,7 +896,7 @@ func (d *decoder) unmarshalInlineTable(itable *unstable.Node, v reflect.Value) e
}
return d.unmarshalInlineTable(itable, elem)
default:
return unstable.NewParserError(d.p.Raw(itable.Raw), int(itable.Raw.Offset), "cannot store inline table in Go type %s", v.Kind())
return unstable.NewParserError(d.p.Raw(itable.Raw), "cannot store inline table in Go type %s", v.Kind())
}
it := itable.Children()
@@ -916,26 +916,26 @@ func (d *decoder) unmarshalInlineTable(itable *unstable.Node, v reflect.Value) e
}
func (d *decoder) unmarshalDateTime(value *unstable.Node, v reflect.Value) error {
dt, err := parseDateTime(value.Data, int(value.Raw.Offset))
dt, err := parseDateTime(value.Data)
if err != nil {
return err
}
if v.Kind() != reflect.Interface && v.Type() != timeType {
return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("datetime", v.Type()))
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("datetime", v.Type()))
}
v.Set(reflect.ValueOf(dt))
return nil
}
func (d *decoder) unmarshalLocalDate(value *unstable.Node, v reflect.Value) error {
ld, err := parseLocalDate(value.Data, int(value.Raw.Offset))
ld, err := parseLocalDate(value.Data)
if err != nil {
return err
}
if v.Kind() != reflect.Interface && v.Type() != timeType {
return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local date", v.Type()))
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local date", v.Type()))
}
if v.Type() == timeType {
v.Set(reflect.ValueOf(ld.AsTime(time.Local)))
@@ -946,34 +946,34 @@ func (d *decoder) unmarshalLocalDate(value *unstable.Node, v reflect.Value) erro
}
func (d *decoder) unmarshalLocalTime(value *unstable.Node, v reflect.Value) error {
lt, rest, err := parseLocalTime(value.Data, int(value.Raw.Offset))
lt, rest, err := parseLocalTime(value.Data)
if err != nil {
return err
}
if len(rest) > 0 {
return unstable.NewParserError(rest, int(value.Raw.Offset)+len(value.Data)-len(rest), "extra characters at the end of a local time")
return unstable.NewParserError(rest, "extra characters at the end of a local time")
}
if v.Kind() != reflect.Interface {
return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local time", v.Type()))
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local time", v.Type()))
}
v.Set(reflect.ValueOf(lt))
return nil
}
func (d *decoder) unmarshalLocalDateTime(value *unstable.Node, v reflect.Value) error {
ldt, rest, err := parseLocalDateTime(value.Data, int(value.Raw.Offset))
ldt, rest, err := parseLocalDateTime(value.Data)
if err != nil {
return err
}
if len(rest) > 0 {
return unstable.NewParserError(rest, int(value.Raw.Offset)+len(value.Data)-len(rest), "extra characters at the end of a local date time")
return unstable.NewParserError(rest, "extra characters at the end of a local date time")
}
if v.Kind() != reflect.Interface && v.Type() != timeType {
return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local datetime", v.Type()))
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local datetime", v.Type()))
}
if v.Type() == timeType {
v.Set(reflect.ValueOf(ldt.AsTime(time.Local)))
@@ -992,14 +992,14 @@ func (d *decoder) unmarshalBool(value *unstable.Node, v reflect.Value) error {
case reflect.Interface:
v.Set(reflect.ValueOf(b))
default:
return unstable.NewParserError(value.Data, int(value.Raw.Offset), "cannot assign boolean to a %t", b)
return unstable.NewParserError(value.Data, "cannot assign boolean to a %t", b)
}
return nil
}
func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error {
f, err := parseFloat(value.Data, int(value.Raw.Offset))
f, err := parseFloat(value.Data)
if err != nil {
return err
}
@@ -1009,13 +1009,13 @@ func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error {
v.SetFloat(f)
case reflect.Float32:
if f > math.MaxFloat32 {
return unstable.NewParserError(value.Data, int(value.Raw.Offset), "number %f does not fit in a float32", f)
return unstable.NewParserError(value.Data, "number %f does not fit in a float32", f)
}
v.SetFloat(f)
case reflect.Interface:
v.Set(reflect.ValueOf(f))
default:
return unstable.NewParserError(value.Data, int(value.Raw.Offset), "float cannot be assigned to %s", v.Kind())
return unstable.NewParserError(value.Data, "float cannot be assigned to %s", v.Kind())
}
return nil
@@ -1048,7 +1048,7 @@ func (d *decoder) unmarshalInteger(value *unstable.Node, v reflect.Value) error
return d.unmarshalFloat(value, v)
}
i, err := parseInteger(value.Data, int(value.Raw.Offset))
i, err := parseInteger(value.Data)
if err != nil {
return err
}
@@ -1116,7 +1116,7 @@ func (d *decoder) unmarshalInteger(value *unstable.Node, v reflect.Value) error
case reflect.Interface:
r = reflect.ValueOf(i)
default:
return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("integer", v.Type()))
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("integer", v.Type()))
}
if !r.Type().AssignableTo(v.Type()) {
@@ -1135,7 +1135,7 @@ func (d *decoder) unmarshalString(value *unstable.Node, v reflect.Value) error {
case reflect.Interface:
v.Set(reflect.ValueOf(string(value.Data)))
default:
return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("string", v.Type()))
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("string", v.Type()))
}
return nil
+3 -7
View File
@@ -28,16 +28,12 @@ func (c *Iterator) Next() bool {
if c.nodes == nil {
return false
}
nodes := *c.nodes
if !c.started {
c.started = true
} else {
idx := c.idx
if idx >= 0 && int(idx) < len(nodes) {
c.idx = nodes[idx].next
}
} else if c.idx >= 0 {
c.idx = (*c.nodes)[c.idx].next
}
return c.idx >= 0 && int(c.idx) < len(nodes)
return c.idx >= 0 && int(c.idx) < len(*c.nodes)
}
// IsLast returns true if the current node of the iterator is the last
+1 -1
View File
@@ -35,7 +35,7 @@ func BenchmarkScanComments(b *testing.B) {
b.ResetTimer()
for i := 0; i < b.N; i++ {
_, _, _ = scanComment(input, 0)
_, _, _ = scanComment(input)
}
})
}
+72 -62
View File
@@ -16,7 +16,6 @@ type ParserError struct {
Highlight []byte
Message string
Key []string // optional
Offset int
}
// Error is the implementation of the error interface.
@@ -28,10 +27,9 @@ func (e *ParserError) Error() string {
//
// Warning: Highlight needs to be a subslice of Parser.data, so only slices
// returned by Parser.Raw are valid candidates.
func NewParserError(highlight []byte, offset int, format string, args ...interface{}) error {
func NewParserError(highlight []byte, format string, args ...interface{}) error {
return &ParserError{
Highlight: highlight,
Offset: offset,
Message: fmt.Errorf(format, args...).Error(),
}
}
@@ -66,8 +64,14 @@ func (p *Parser) Data() []byte {
return p.data
}
func (p *Parser) offsetOf(b []byte) int {
return len(p.data) - len(b)
// Range returns a range description that corresponds to a given slice of the
// input. If the argument is not a subslice of the parser input, this function
// panics.
func (p *Parser) Range(b []byte) Range {
return Range{
Offset: uint32(p.subsliceOffset(b)), //nolint:gosec // TOML documents are small
Length: uint32(len(b)), //nolint:gosec // TOML documents are small
}
}
// rangeOfToken computes the Range of a token given the remaining bytes after the token.
@@ -78,6 +82,13 @@ func (p *Parser) rangeOfToken(token, rest []byte) Range {
return Range{Offset: uint32(offset), Length: uint32(len(token))} //nolint:gosec // TOML documents are small
}
// subsliceOffset returns the byte offset of subslice b within p.data.
// b must be a suffix (tail) of p.data.
func (p *Parser) subsliceOffset(b []byte) int {
// b is a suffix of p.data, so its offset is len(p.data) - len(b)
return len(p.data) - len(b)
}
// Raw returns the slice corresponding to the bytes in the given range.
func (p *Parser) Raw(raw Range) []byte {
return p.data[raw.Offset : raw.Offset+raw.Length]
@@ -187,16 +198,16 @@ func (p *Parser) parseNewline(b []byte) ([]byte, error) {
}
if b[0] == '\r' {
_, rest, err := scanWindowsNewline(b, p.offsetOf(b))
_, rest, err := scanWindowsNewline(b)
return rest, err
}
return nil, NewParserError(b[0:1], p.offsetOf(b), "expected newline but got %#U", b[0])
return nil, NewParserError(b[0:1], "expected newline but got %#U", b[0])
}
func (p *Parser) parseComment(b []byte) (reference, []byte, error) {
ref := invalidReference
data, rest, err := scanComment(b, p.offsetOf(b))
data, rest, err := scanComment(b)
if p.KeepComments && err == nil {
ref = p.builder.Push(Node{
Kind: Comment,
@@ -280,12 +291,12 @@ func (p *Parser) parseArrayTable(b []byte) (reference, []byte, error) {
p.builder.AttachChild(ref, k)
b = p.parseWhitespace(b)
b, err = expect(']', b, p.offsetOf(b))
b, err = expect(']', b)
if err != nil {
return ref, nil, err
}
b, err = expect(']', b, p.offsetOf(b))
b, err = expect(']', b)
return ref, b, err
}
@@ -310,7 +321,7 @@ func (p *Parser) parseStdTable(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b)
b, err = expect(']', b, p.offsetOf(b))
b, err = expect(']', b)
return ref, b, err
}
@@ -334,10 +345,10 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b)
if len(b) == 0 {
return invalidReference, nil, NewParserError(startB[:len(startB)-len(b)], p.offsetOf(startB), "expected = after a key, but the document ends there")
return invalidReference, nil, NewParserError(startB[:len(startB)-len(b)], "expected = after a key, but the document ends there")
}
b, err = expect('=', b, p.offsetOf(b))
b, err = expect('=', b)
if err != nil {
return invalidReference, nil, err
}
@@ -352,10 +363,9 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
p.builder.Chain(valRef, key)
p.builder.AttachChild(ref, valRef)
// Set Raw to span the entire key-value expression.
// Access the node directly in the slice to avoid the write barrier
// that NodeAt's nodes-pointer setup would trigger.
p.builder.tree.nodes[ref].Raw = p.rangeOfToken(startB[:len(startB)-len(b)], b)
// Set Raw to span the entire key-value expression
node := p.builder.NodeAt(ref)
node.Raw = p.rangeOfToken(startB[:len(startB)-len(b)], b)
return ref, b, err
}
@@ -366,7 +376,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
ref := invalidReference
if len(b) == 0 {
return ref, nil, NewParserError(b, p.offsetOf(b), "expected value, not eof")
return ref, nil, NewParserError(b, "expected value, not eof")
}
var err error
@@ -411,7 +421,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
return ref, b, err
case 't':
if !scanFollowsTrue(b) {
return ref, nil, NewParserError(atmost(b, 4), p.offsetOf(b), "expected 'true'")
return ref, nil, NewParserError(atmost(b, 4), "expected 'true'")
}
ref = p.builder.Push(Node{
@@ -422,7 +432,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
return ref, b[4:], nil
case 'f':
if !scanFollowsFalse(b) {
return ref, nil, NewParserError(atmost(b, 5), p.offsetOf(b), "expected 'false'")
return ref, nil, NewParserError(atmost(b, 5), "expected 'false'")
}
ref = p.builder.Push(Node{
@@ -449,7 +459,7 @@ func atmost(b []byte, n int) []byte {
}
func (p *Parser) parseLiteralString(b []byte) ([]byte, []byte, []byte, error) {
v, rest, err := scanLiteralString(b, p.offsetOf(b))
v, rest, err := scanLiteralString(b)
if err != nil {
return nil, nil, nil, err
}
@@ -481,7 +491,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b)
if len(b) == 0 {
return parent, nil, NewParserError(previousB[:1], p.offsetOf(previousB), "inline table is incomplete")
return parent, nil, NewParserError(previousB[:1], "inline table is incomplete")
}
if b[0] == '}' {
@@ -489,7 +499,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
}
if !first {
b, err = expect(',', b, p.offsetOf(b))
b, err = expect(',', b)
if err != nil {
return parent, nil, err
}
@@ -513,7 +523,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
first = false
}
rest, err := expect('}', b, p.offsetOf(b))
rest, err := expect('}', b)
return parent, rest, err
}
@@ -562,7 +572,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
}
if len(b) == 0 {
return parent, nil, NewParserError(arrayStart[:1], p.offsetOf(arrayStart), "array is incomplete")
return parent, nil, NewParserError(arrayStart[:1], "array is incomplete")
}
if b[0] == ']' {
@@ -571,7 +581,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
if b[0] == ',' {
if first {
return parent, nil, NewParserError(b[0:1], p.offsetOf(b), "array cannot start with comma")
return parent, nil, NewParserError(b[0:1], "array cannot start with comma")
}
b = b[1:]
@@ -583,7 +593,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
addChild(cref)
}
} else if !first {
return parent, nil, NewParserError(b[0:1], p.offsetOf(b), "array elements must be separated by commas")
return parent, nil, NewParserError(b[0:1], "array elements must be separated by commas")
}
// TOML allows trailing commas in arrays.
@@ -610,7 +620,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
first = false
}
rest, err := expect(']', b, p.offsetOf(b))
rest, err := expect(']', b)
return parent, rest, err
}
@@ -665,7 +675,7 @@ func (p *Parser) parseOptionalWhitespaceCommentNewline(b []byte) (reference, []b
}
func (p *Parser) parseMultilineLiteralString(b []byte) ([]byte, []byte, []byte, error) {
token, rest, err := scanMultilineLiteralString(b, p.offsetOf(b))
token, rest, err := scanMultilineLiteralString(b)
if err != nil {
return nil, nil, nil, err
}
@@ -694,7 +704,7 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
// mlb-quotes = 1*2quotation-mark
// mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// mlb-escaped-nl = escape ws newline *( wschar / newline )
token, escaped, rest, err := scanMultilineBasicString(b, p.offsetOf(b))
token, escaped, rest, err := scanMultilineBasicString(b)
if err != nil {
return nil, nil, nil, err
}
@@ -711,15 +721,14 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
// fast path
startIdx := i
endIdx := len(token) - len(`"""`)
tokenBase := p.offsetOf(token)
if !escaped {
str := token[startIdx:endIdx]
invalidIdx := characters.Utf8TomlValidAlreadyEscaped(str)
if invalidIdx < 0 {
highlight := characters.Utf8TomlValidAlreadyEscaped(str)
if len(highlight) == 0 {
return token, str, rest, nil
}
return nil, nil, nil, NewParserError(str[invalidIdx:invalidIdx+1], tokenBase+startIdx+invalidIdx, "invalid UTF-8")
return nil, nil, nil, NewParserError(highlight, "invalid UTF-8")
}
var builder bytes.Buffer
@@ -784,14 +793,14 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
case 'e':
builder.WriteByte(0x1B)
case 'u':
x, err := hexToRune(atmost(token[i+1:], 4), tokenBase+i+1, 4)
x, err := hexToRune(atmost(token[i+1:], 4), 4)
if err != nil {
return nil, nil, nil, err
}
builder.WriteRune(x)
i += 4
case 'U':
x, err := hexToRune(atmost(token[i+1:], 8), tokenBase+i+1, 8)
x, err := hexToRune(atmost(token[i+1:], 8), 8)
if err != nil {
return nil, nil, nil, err
}
@@ -799,13 +808,13 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
builder.WriteRune(x)
i += 8
default:
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid escaped character %#U", c)
return nil, nil, nil, NewParserError(token[i:i+1], "invalid escaped character %#U", c)
}
i++
} else {
size := characters.Utf8ValidNext(token[i:])
if size == 0 {
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid character %#U", c)
return nil, nil, nil, NewParserError(token[i:i+1], "invalid character %#U", c)
}
builder.Write(token[i : i+size])
i += size
@@ -860,9 +869,12 @@ func (p *Parser) parseKey(b []byte) (reference, []byte, error) {
func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) {
if len(b) == 0 {
return nil, nil, nil, NewParserError(b, p.offsetOf(b), "expected key but found none")
return nil, nil, nil, NewParserError(b, "expected key but found none")
}
// simple-key = quoted-key / unquoted-key
// unquoted-key = 1*( ALPHA / DIGIT / %x2D / %x5F ) ; A-Z / a-z / 0-9 / - / _
// quoted-key = basic-string / literal-string
switch {
case b[0] == '\'':
return p.parseLiteralString(b)
@@ -872,7 +884,7 @@ func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) {
key, rest = scanUnquotedKey(b)
return key, key, rest, nil
default:
return nil, nil, nil, NewParserError(b[0:1], p.offsetOf(b), "invalid character at start of key: %c", b[0])
return nil, nil, nil, NewParserError(b[0:1], "invalid character at start of key: %c", b[0])
}
}
@@ -892,7 +904,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
// escape-seq-char =/ %x74 ; t tab U+0009
// escape-seq-char =/ %x75 4HEXDIG ; uXXXX U+XXXX
// escape-seq-char =/ %x55 8HEXDIG ; UXXXXXXXX U+XXXXXXXX
token, escaped, rest, err := scanBasicString(b, p.offsetOf(b))
token, escaped, rest, err := scanBasicString(b)
if err != nil {
return nil, nil, nil, err
}
@@ -903,15 +915,13 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
// Fast path. If there is no escape sequence, the string should just be
// an UTF-8 encoded string, which is the same as Go. In that case,
// validate the string and return a direct reference to the buffer.
tokenBase := p.offsetOf(token)
if !escaped {
str := token[startIdx:endIdx]
invalidIdx := characters.Utf8TomlValidAlreadyEscaped(str)
if invalidIdx < 0 {
highlight := characters.Utf8TomlValidAlreadyEscaped(str)
if len(highlight) == 0 {
return token, str, rest, nil
}
return nil, nil, nil, NewParserError(str[invalidIdx:invalidIdx+1], tokenBase+startIdx+invalidIdx, "invalid UTF-8")
return nil, nil, nil, NewParserError(highlight, "invalid UTF-8")
}
i := startIdx
@@ -942,7 +952,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
case 'e':
builder.WriteByte(0x1B)
case 'u':
x, err := hexToRune(token[i+1:len(token)-1], tokenBase+i+1, 4)
x, err := hexToRune(token[i+1:len(token)-1], 4)
if err != nil {
return nil, nil, nil, err
}
@@ -950,7 +960,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
builder.WriteRune(x)
i += 4
case 'U':
x, err := hexToRune(token[i+1:len(token)-1], tokenBase+i+1, 8)
x, err := hexToRune(token[i+1:len(token)-1], 8)
if err != nil {
return nil, nil, nil, err
}
@@ -958,13 +968,13 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
builder.WriteRune(x)
i += 8
default:
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid escaped character %#U", c)
return nil, nil, nil, NewParserError(token[i:i+1], "invalid escaped character %#U", c)
}
i++
} else {
size := characters.Utf8ValidNext(token[i:])
if size == 0 {
return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid character %#U", c)
return nil, nil, nil, NewParserError(token[i:i+1], "invalid character %#U", c)
}
builder.Write(token[i : i+size])
i += size
@@ -974,9 +984,9 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
return token, builder.Bytes(), rest, nil
}
func hexToRune(b []byte, base int, length int) (rune, error) {
func hexToRune(b []byte, length int) (rune, error) {
if len(b) < length {
return -1, NewParserError(b, base, "unicode point needs %d character, not %d", length, len(b))
return -1, NewParserError(b, "unicode point needs %d character, not %d", length, len(b))
}
b = b[:length]
@@ -991,13 +1001,13 @@ func hexToRune(b []byte, base int, length int) (rune, error) {
case 'A' <= c && c <= 'F':
d = uint32(c - 'A' + 10)
default:
return -1, NewParserError(b[i:i+1], base+i, "non-hex character")
return -1, NewParserError(b[i:i+1], "non-hex character")
}
r = r*16 + d
}
if r > unicode.MaxRune || 0xD800 <= r && r < 0xE000 {
return -1, NewParserError(b, base, "escape sequence is invalid Unicode code point")
return -1, NewParserError(b, "escape sequence is invalid Unicode code point")
}
return rune(r), nil
@@ -1017,7 +1027,7 @@ func (p *Parser) parseIntOrFloatOrDateTime(b []byte) (reference, []byte, error)
switch b[0] {
case 'i':
if !scanFollowsInf(b) {
return invalidReference, nil, NewParserError(atmost(b, 3), p.offsetOf(b), "expected 'inf'")
return invalidReference, nil, NewParserError(atmost(b, 3), "expected 'inf'")
}
return p.builder.Push(Node{
@@ -1027,7 +1037,7 @@ func (p *Parser) parseIntOrFloatOrDateTime(b []byte) (reference, []byte, error)
}), b[3:], nil
case 'n':
if !scanFollowsNan(b) {
return invalidReference, nil, NewParserError(atmost(b, 3), p.offsetOf(b), "expected 'nan'")
return invalidReference, nil, NewParserError(atmost(b, 3), "expected 'nan'")
}
return p.builder.Push(Node{
@@ -1186,7 +1196,7 @@ func (p *Parser) scanIntOrFloat(b []byte) (reference, []byte, error) {
}), b[i+3:], nil
}
return invalidReference, nil, NewParserError(b[i:i+1], p.offsetOf(b)+i, "unexpected character 'i' while scanning for a number")
return invalidReference, nil, NewParserError(b[i:i+1], "unexpected character 'i' while scanning for a number")
}
if c == 'n' {
@@ -1198,14 +1208,14 @@ func (p *Parser) scanIntOrFloat(b []byte) (reference, []byte, error) {
}), b[i+3:], nil
}
return invalidReference, nil, NewParserError(b[i:i+1], p.offsetOf(b)+i, "unexpected character 'n' while scanning for a number")
return invalidReference, nil, NewParserError(b[i:i+1], "unexpected character 'n' while scanning for a number")
}
break
}
if i == 0 {
return invalidReference, b, NewParserError(b, p.offsetOf(b), "incomplete number")
return invalidReference, b, NewParserError(b, "incomplete number")
}
kind := Integer
@@ -1242,13 +1252,13 @@ func isValidBinaryRune(r byte) bool {
return r == '0' || r == '1' || r == '_'
}
func expect(x byte, b []byte, base int) ([]byte, error) {
func expect(x byte, b []byte) ([]byte, error) {
if len(b) == 0 {
return nil, NewParserError(b, base, "expected character %c but the document ended here", x)
return nil, NewParserError(b, "expected character %c but the document ended here", x)
}
if b[0] != x {
return nil, NewParserError(b[0:1], base, "expected character %c", x)
return nil, NewParserError(b[0:1], "expected character %c", x)
}
return b[1:], nil
-91
View File
@@ -1,7 +1,6 @@
package unstable
import (
"errors"
"fmt"
"strconv"
"strings"
@@ -674,96 +673,6 @@ key3 = "value3"
assert.Equal(t, []string{"key1", "key2", "key3"}, keys)
}
func TestErrorOffsetAfterComment(t *testing.T) {
input := []byte("# comment\n= \"value\"")
p := Parser{}
p.Reset(input)
for p.NextExpression() {
}
err := p.Error()
if err == nil {
t.Fatal("expected an error")
}
var perr *ParserError
if !errors.As(err, &perr) {
t.Fatalf("expected ParserError, got %T", err)
}
if perr.Offset != 10 {
t.Errorf("offset: got %d, want 10", perr.Offset)
}
shape := p.Shape(Range{Offset: uint32(perr.Offset), Length: uint32(len(perr.Highlight))})
if shape.Start.Line != 2 || shape.Start.Column != 1 {
t.Errorf("position: got %d:%d, want 2:1", shape.Start.Line, shape.Start.Column)
}
}
func TestErrorHighlightPositions(t *testing.T) {
examples := []struct {
desc string
input string
wantLine int
wantColumn int
}{
{
desc: "invalid key start after comment",
input: "# comment\n= \"value\"",
wantLine: 2,
wantColumn: 1,
},
{
desc: "invalid key start on first line",
input: "= \"value\"",
wantLine: 1,
wantColumn: 1,
},
{
desc: "invalid key after multiple comments",
input: "# comment 1\n# comment 2\n= \"value\"",
wantLine: 3,
wantColumn: 1,
},
{
desc: "invalid key after valid key-value",
input: "a = 1\n= \"value\"",
wantLine: 2,
wantColumn: 1,
},
{
desc: "invalid key after whitespace on line",
input: "a = 1\n = \"value\"",
wantLine: 2,
wantColumn: 3,
},
}
for _, e := range examples {
t.Run(e.desc, func(t *testing.T) {
p := Parser{}
p.Reset([]byte(e.input))
for p.NextExpression() {
}
err := p.Error()
if err == nil {
t.Fatal("expected an error")
}
var perr *ParserError
if !errors.As(err, &perr) {
t.Fatalf("expected ParserError, got %T", err)
}
shape := p.Shape(Range{Offset: uint32(perr.Offset), Length: uint32(len(perr.Highlight))})
if shape.Start.Line != e.wantLine {
t.Errorf("line: got %d, want %d", shape.Start.Line, e.wantLine)
}
if shape.Start.Column != e.wantColumn {
t.Errorf("column: got %d, want %d", shape.Start.Column, e.wantColumn)
}
})
}
}
func ExampleParser() {
doc := `
hello = "world"
+72 -27
View File
@@ -47,31 +47,48 @@ func isUnquotedKeyChar(r byte) bool {
return (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' || r == '_'
}
func scanLiteralString(b []byte, base int) ([]byte, []byte, error) {
func scanLiteralString(b []byte) ([]byte, []byte, error) {
// literal-string = apostrophe *literal-char apostrophe
// apostrophe = %x27 ; ' apostrophe
// literal-char = %x09 / %x20-26 / %x28-7E / non-ascii
for i := 1; i < len(b); {
switch b[i] {
case '\'':
return b[:i+1], b[i+1:], nil
case '\n', '\r':
return nil, nil, NewParserError(b[i:i+1], base+i, "literal strings cannot have new lines")
return nil, nil, NewParserError(b[i:i+1], "literal strings cannot have new lines")
}
size := characters.Utf8ValidNext(b[i:])
if size == 0 {
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character")
return nil, nil, NewParserError(b[i:i+1], "invalid character")
}
i += size
}
return nil, nil, NewParserError(b[len(b):], base+len(b), "unterminated literal string")
return nil, nil, NewParserError(b[len(b):], "unterminated literal string")
}
func scanMultilineLiteralString(b []byte, base int) ([]byte, []byte, error) {
func scanMultilineLiteralString(b []byte) ([]byte, []byte, error) {
// ml-literal-string = ml-literal-string-delim [ newline ] ml-literal-body
// ml-literal-string-delim
// ml-literal-string-delim = 3apostrophe
// ml-literal-body = *mll-content *( mll-quotes 1*mll-content ) [ mll-quotes ]
//
// mll-content = mll-char / newline
// mll-char = %x09 / %x20-26 / %x28-7E / non-ascii
// mll-quotes = 1*2apostrophe
for i := 3; i < len(b); {
switch b[i] {
case '\'':
if scanFollowsMultilineLiteralStringDelimiter(b[i:]) {
i += 3
// At that point we found 3 apostrophe, and i is the
// index of the byte after the third one. The scanner
// needs to be eager, because there can be an extra 2
// apostrophe that can be accepted at the end of the
// string.
if i >= len(b) || b[i] != '\'' {
return b[:i], b[i:], nil
}
@@ -83,39 +100,39 @@ func scanMultilineLiteralString(b []byte, base int) ([]byte, []byte, error) {
i++
if i < len(b) && b[i] == '\'' {
return nil, nil, NewParserError(b[i-3:i+1], base+i-3, "''' not allowed in multiline literal string")
return nil, nil, NewParserError(b[i-3:i+1], "''' not allowed in multiline literal string")
}
return b[:i], b[i:], nil
}
case '\r':
if len(b) < i+2 {
return nil, nil, NewParserError(b[len(b):], base+len(b), `need a \n after \r`)
return nil, nil, NewParserError(b[len(b):], `need a \n after \r`)
}
if b[i+1] != '\n' {
return nil, nil, NewParserError(b[i:i+2], base+i, `need a \n after \r`)
return nil, nil, NewParserError(b[i:i+2], `need a \n after \r`)
}
i += 2
i += 2 // skip the \n
continue
}
size := characters.Utf8ValidNext(b[i:])
if size == 0 {
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character")
return nil, nil, NewParserError(b[i:i+1], "invalid character")
}
i += size
}
return nil, nil, NewParserError(b[len(b):], base+len(b), `multiline literal string not terminated by '''`)
return nil, nil, NewParserError(b[len(b):], `multiline literal string not terminated by '''`)
}
func scanWindowsNewline(b []byte, base int) ([]byte, []byte, error) {
func scanWindowsNewline(b []byte) ([]byte, []byte, error) {
const lenCRLF = 2
if len(b) < lenCRLF {
return nil, nil, NewParserError(b, base, "windows new line expected")
return nil, nil, NewParserError(b, "windows new line expected")
}
if b[1] != '\n' {
return nil, nil, NewParserError(b, base, `windows new line should be \r\n`)
return nil, nil, NewParserError(b, `windows new line should be \r\n`)
}
return b[:lenCRLF], b[lenCRLF:], nil
@@ -134,7 +151,13 @@ func scanWhitespace(b []byte) ([]byte, []byte) {
return b, b[len(b):]
}
func scanComment(b []byte, base int) ([]byte, []byte, error) {
func scanComment(b []byte) ([]byte, []byte, error) {
// comment-start-symbol = %x23 ; #
// non-ascii = %x80-D7FF / %xE000-10FFFF
// non-eol = %x09 / %x20-7F / non-ascii
//
// comment = comment-start-symbol *non-eol
for i := 1; i < len(b); {
if b[i] == '\n' {
return b[:i], b[i:], nil
@@ -143,11 +166,11 @@ func scanComment(b []byte, base int) ([]byte, []byte, error) {
if i+1 < len(b) && b[i+1] == '\n' {
return b[:i+1], b[i+1:], nil
}
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character in comment")
return nil, nil, NewParserError(b[i:i+1], "invalid character in comment")
}
size := characters.Utf8ValidNext(b[i:])
if size == 0 {
return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character in comment")
return nil, nil, NewParserError(b[i:i+1], "invalid character in comment")
}
i += size
@@ -156,7 +179,12 @@ func scanComment(b []byte, base int) ([]byte, []byte, error) {
return b, b[len(b):], nil
}
func scanBasicString(b []byte, base int) ([]byte, bool, []byte, error) {
func scanBasicString(b []byte) ([]byte, bool, []byte, error) {
// basic-string = quotation-mark *basic-char quotation-mark
// quotation-mark = %x22 ; "
// basic-char = basic-unescaped / escaped
// basic-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// escaped = escape escape-seq-char
escaped := false
i := 1
@@ -165,20 +193,31 @@ func scanBasicString(b []byte, base int) ([]byte, bool, []byte, error) {
case '"':
return b[:i+1], escaped, b[i+1:], nil
case '\n', '\r':
return nil, escaped, nil, NewParserError(b[i:i+1], base+i, "basic strings cannot have new lines")
return nil, escaped, nil, NewParserError(b[i:i+1], "basic strings cannot have new lines")
case '\\':
if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[i:i+1], base+i, "need a character after \\")
return nil, escaped, nil, NewParserError(b[i:i+1], "need a character after \\")
}
escaped = true
i++ // skip the next character
}
}
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `basic string not terminated by "`)
return nil, escaped, nil, NewParserError(b[len(b):], `basic string not terminated by "`)
}
func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error) {
func scanMultilineBasicString(b []byte) ([]byte, bool, []byte, error) {
// ml-basic-string = ml-basic-string-delim [ newline ] ml-basic-body
// ml-basic-string-delim
// ml-basic-string-delim = 3quotation-mark
// ml-basic-body = *mlb-content *( mlb-quotes 1*mlb-content ) [ mlb-quotes ]
//
// mlb-content = mlb-char / newline / mlb-escaped-nl
// mlb-char = mlb-unescaped / escaped
// mlb-quotes = 1*2quotation-mark
// mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// mlb-escaped-nl = escape ws newline *( wschar / newline )
escaped := false
i := 3
@@ -188,6 +227,12 @@ func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error)
if scanFollowsMultilineBasicStringDelimiter(b[i:]) {
i += 3
// At that point we found 3 apostrophe, and i is the
// index of the byte after the third one. The scanner
// needs to be eager, because there can be an extra 2
// apostrophe that can be accepted at the end of the
// string.
if i >= len(b) || b[i] != '"' {
return b[:i], escaped, b[i:], nil
}
@@ -199,27 +244,27 @@ func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error)
i++
if i < len(b) && b[i] == '"' {
return nil, escaped, nil, NewParserError(b[i-3:i+1], base+i-3, `""" not allowed in multiline basic string`)
return nil, escaped, nil, NewParserError(b[i-3:i+1], `""" not allowed in multiline basic string`)
}
return b[:i], escaped, b[i:], nil
}
case '\\':
if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), "need a character after \\")
return nil, escaped, nil, NewParserError(b[len(b):], "need a character after \\")
}
escaped = true
i++ // skip the next character
case '\r':
if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `need a \n after \r`)
return nil, escaped, nil, NewParserError(b[len(b):], `need a \n after \r`)
}
if b[i+1] != '\n' {
return nil, escaped, nil, NewParserError(b[i:i+2], base+i, `need a \n after \r`)
return nil, escaped, nil, NewParserError(b[i:i+2], `need a \n after \r`)
}
i++ // skip the \n
}
}
return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `multiline basic string not terminated by """`)
return nil, escaped, nil, NewParserError(b[len(b):], `multiline basic string not terminated by """`)
}