Compare commits

8 Commits

Author SHA1 Message Date
Cursor Agent 89f970069c Remove Parser.Range and subsliceOffset
Range() existed to recover byte offsets from Highlight subslices.
Now that ParserError carries an explicit Offset field, Range() is
unnecessary. Remove it along with the private subsliceOffset helper
in ast.go.

Tests now use perr.Offset directly and construct Range literals
for Shape() calls.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 19:17:12 +00:00
Cursor Agent a646ffd9fa Make error position tracking explicit with Offset field on ParserError
Thread byte offset information through all error creation sites,
eliminating the need for SubsliceOffset to recover position from
pointer comparison.

Changes:
- Add Offset field to ParserError struct
- Add offset parameter to NewParserError
- Add Parser.offsetOf helper for suffix-length arithmetic
- Thread base offset through scanner functions (scanComment,
  scanBasicString, scanMultilineBasicString, scanLiteralString,
  scanMultilineLiteralString, scanWindowsNewline)
- Thread base offset through standalone functions (expect, hexToRune)
- Thread base offset through all decode functions (parseInteger,
  parseFloat, parseLocalDate, parseLocalTime, parseLocalDateTime,
  parseDateTime, checkAndRemoveUnderscores*)
- Update all unmarshaler call sites to pass value.Raw.Offset
- Update localtime.go UnmarshalText methods with base=0
- Update strict.go to populate Offset from key ranges
- Change wrapDecodeError to read de.Offset directly
- Change Utf8TomlValidAlreadyEscaped to return int index (-1 if valid)
  instead of a byte subslice
- Unexport SubsliceOffset (now only used internally by Range())

This makes error positions self-describing: each ParserError carries its
own byte offset, so callers no longer need the original document slice
and address arithmetic to determine where an error occurred.
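
As a rough illustration of the "self-describing" error positions described above, the sketch below threads a base offset through a small digit parser. The names `posError` and `parseDigits` are hypothetical stand-ins, not the library's actual `ParserError` or `parseDecimalDigits`; only the idea of passing `base` down and reporting `base+i` is taken from this commit.

```go
package main

import (
	"errors"
	"fmt"
)

// posError is a hypothetical error type that carries its own byte offset,
// so callers do not need the original document slice to locate the error.
type posError struct {
	Offset  int
	Message string
}

func (e *posError) Error() string {
	return fmt.Sprintf("offset %d: %s", e.Offset, e.Message)
}

// parseDigits parses b as decimal digits. base is the offset of b[0] in the
// whole document, so an error at b[i] is reported at base+i.
func parseDigits(b []byte, base int) (int, error) {
	v := 0
	for i, c := range b {
		if c < '0' || c > '9' {
			return 0, &posError{Offset: base + i, Message: "expected digit (0-9)"}
		}
		v = v*10 + int(c-'0')
	}
	return v, nil
}

func main() {
	doc := []byte("d = 2026-0x-12")
	date := doc[4:] // the date value starts at document offset 4

	// The month field is date[5:7], which sits at document offset 4+5.
	_, err := parseDigits(date[5:7], 4+5)

	var pe *posError
	if errors.As(err, &pe) {
		fmt.Println(pe.Offset, string(doc[pe.Offset])) // 10 x
	}
}
```

The caller adds the field's position to its own base at each level, which appears to be the same pattern as the `base+5` and `base+8` arguments in the diff further down.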

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 19:08:55 +00:00
Cursor Agent d75117e61f Consolidate subslice offset into a single SubsliceOffset function
Remove the private subsliceOffset methods from both parser.go and
errors.go. Replace them with a single exported SubsliceOffset function
in ast.go (next to the Range type it serves).

SubsliceOffset finds the byte offset by comparing element addresses:
&data[i] == &subslice[0]. This is well-defined Go pointer comparison
on elements of the same backing array.

This fixes the v2.3.0 regression (#1047) where the parser's
subsliceOffset used len(data) - len(b), which only works for suffix
slices, not arbitrary subslices like error highlights. It also removes
the reflect-based implementation from errors.go.

Fixes #1047

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 18:33:22 +00:00
Cursor Agent 19174a4293 Remove cap tricks, use address comparison for subslice offset
Replace cap(parent) - cap(subslice) with a straightforward scan
that compares element addresses: &data[i] == &subslice[0]. This is
well-defined Go pointer comparison on elements of the same backing
array, with no dependency on capacity semantics, reflect, or unsafe.

The scan is O(n) but only runs on error paths, and TOML documents
are small per the project's design constraints.
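
A minimal sketch of the address-comparison scan described above. The function name `offsetByAddress` and its handling of empty or unrelated slices (returning -1) are assumptions made for illustration; the commit only states the core comparison `&data[i] == &subslice[0]`.

```go
package main

import "fmt"

// offsetByAddress returns the index i such that &data[i] == &sub[0], or -1
// if sub is empty or does not start inside data. This is an illustrative
// stand-in, not the library's function.
func offsetByAddress(data, sub []byte) int {
	if len(sub) == 0 {
		return -1 // sub[0] would panic; there is no element to compare
	}
	for i := range data {
		if &data[i] == &sub[0] {
			return i
		}
	}
	return -1
}

func main() {
	data := []byte(`key = "value"`)

	fmt.Println(offsetByAddress(data, data[4:5]))   // 4
	fmt.Println(offsetByAddress(data, data[4:5:5])) // 4, capacity plays no role
	fmt.Println(offsetByAddress(data, []byte("="))) // -1, separate backing array
}
```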

Also remove the Offset field from ParserError and the setErrOffset
machinery — the offset is computed at the point of consumption
(wrapDecodeError, Parser.Range) rather than cached on the error.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 18:17:55 +00:00
Cursor Agent 96ac48eb74 Remove optional offset and fallback, guarantee offset by construction
ParserError.Offset is now a plain exported int field, always set:
- The parser sets it via setErrOffset() when capturing parse errors
- strict.go sets it from the key's Raw range at construction
- wrapDecodeError computes it inline from cap(document) - cap(highlight)

This eliminates:
- The SetOffset/Offset() accessor methods and offsetValid flag
- The subsliceOffset fallback function in errors.go
- Any conditional logic around whether the offset is present

The offset is guaranteed by construction at every path that creates
or consumes a ParserError.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 17:25:32 +00:00
Cursor Agent f7136d052b Replace reflect-based subslice offset with cap arithmetic
Use cap(parent) - cap(subslice) to compute byte offsets between slices
that share a backing array. This is safe pure Go: subslicing preserves
the backing array and adjusts the capacity accordingly, so the
difference in capacities equals the byte offset.
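
The capacity arithmetic can be sketched as follows. `capOffset` is a hypothetical helper name. The second call shows one case where the arithmetic gives a different answer, namely a subslice built with a full (three-index) slice expression; whether the parser ever produces such slices is not stated in this commit.

```go
package main

import "fmt"

// capOffset computes cap(parent) - cap(sub). For sub := parent[i:j] this
// equals i, because two-index slicing keeps the capacity up to the end of
// the parent.
func capOffset(parent, sub []byte) int {
	return cap(parent) - cap(sub)
}

func main() {
	// make gives a slice whose capacity equals its length (12 here).
	data := make([]byte, 12)
	copy(data, "a = 1\nb = 2\n")

	fmt.Println(capOffset(data, data[6:7]))   // 6
	fmt.Println(capOffset(data, data[6:7:7])) // 11, the capped capacity skews the result
}
```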

This removes the reflect import from both errors.go and
unstable/parser.go, eliminating the last reflect-based pointer
arithmetic used for error position tracking.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 17:08:58 +00:00
Cursor Agent 154d80392f Cache error offset in ParserError for safer position tracking
Instead of requiring downstream consumers to re-derive the byte offset
from pointer arithmetic on the Highlight slice, compute and cache the
offset inside the parser at error-capture time via setErrOffset().

This is safer because:
- The parser is the one place where the backing-array guarantee is known
  to hold (Highlight is always a subslice of the parse buffer)
- Downstream consumers (wrapDecodeError) can use the cached offset
  directly, avoiding the need for pointer comparison
- Errors created outside the parser (strict.go) set the offset from
  existing Raw ranges, which are already correct by construction

Add ParserError.SetOffset/Offset methods for setting and retrieving the
cached offset. Update wrapDecodeError to prefer the cached offset when
available, falling back to subsliceOffset for backward compatibility.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 13:00:38 +00:00
Cursor Agent d528d3c6b4 Fix Parser.Range returning wrong offset for error highlights
The subsliceOffset method incorrectly computed offset as
len(p.data) - len(b), which only works when b is a suffix (tail) of
p.data. However, error highlights (ParserError.Highlight) are arbitrary
subslices from the middle of the input (e.g., b[0:1] from parseSimpleKey),
so their len has no relationship to their position.
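
To make the failure mode concrete, here is a small sketch of the len-based arithmetic applied to a suffix and to a mid-document highlight. `suffixOffset` is a hypothetical name standing in for the method described above.

```go
package main

import "fmt"

// suffixOffset mirrors the len-based arithmetic: it is only correct when b
// is a tail of data, that is, b == data[k:] for some k.
func suffixOffset(data, b []byte) int {
	return len(data) - len(b)
}

func main() {
	data := []byte(`key = "value"`) // 13 bytes

	tail := data[6:]       // suffix starting at offset 6
	highlight := data[4:5] // one-byte highlight on the '=' at offset 4

	fmt.Println(suffixOffset(data, tail))      // 6, correct
	fmt.Println(suffixOffset(data, highlight)) // 12, although the real offset is 4
}
```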

This was a regression introduced in commit 3aaf147 (Remove unsafe package
usage) which replaced danger.SubsliceOffset (pointer arithmetic) with the
incorrect len-based approach.

Fix by using reflect.ValueOf().Pointer() to compute the actual byte
offset between slice data pointers, matching the approach already used
in errors.go:subsliceOffset.

Fixes #1047

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 12:40:14 +00:00
46 changed files with 496 additions and 403 deletions
+1 -1
@@ -5,7 +5,7 @@ Thank you for your pull request!
Please read the Code changes section of the CONTRIBUTING.md file, Please read the Code changes section of the CONTRIBUTING.md file,
and make sure you have followed the instructions. and make sure you have followed the instructions.
https://git.ostiwe.com/ostiwe/go-toml/blob/v2/CONTRIBUTING.md#code-changes https://github.com/pelletier/go-toml/blob/v2/CONTRIBUTING.md#code-changes
--> -->
+6 -6
@@ -21,7 +21,7 @@ improvement, or new features that weren't envisioned before. Sometimes, a
seemingly innocent question leads to the fix of a bug. Don't hesitate and ask seemingly innocent question leads to the fix of a bug. Don't hesitate and ask
away! away!
[discussions]: https://git.ostiwe.com/ostiwe/go-toml/discussions [discussions]: https://github.com/pelletier/go-toml/discussions
## Improve the documentation ## Improve the documentation
@@ -224,12 +224,12 @@ Checklist:
5. If new version is an alpha or beta only, check pre-release box. 5. If new version is an alpha or beta only, check pre-release box.
[issues-tracker]: https://git.ostiwe.com/ostiwe/go-toml/issues [issues-tracker]: https://github.com/pelletier/go-toml/issues
[bug-report]: https://git.ostiwe.com/ostiwe/go-toml/issues/new?template=bug_report.md [bug-report]: https://github.com/pelletier/go-toml/issues/new?template=bug_report.md
[pkg.go.dev]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml [pkg.go.dev]: https://pkg.go.dev/github.com/pelletier/go-toml
[readme]: ./README.md [readme]: ./README.md
[fork]: https://help.github.com/articles/fork-a-repo [fork]: https://help.github.com/articles/fork-a-repo
[pull-request]: https://help.github.com/en/articles/creating-a-pull-request [pull-request]: https://help.github.com/en/articles/creating-a-pull-request
[new-release]: https://git.ostiwe.com/ostiwe/go-toml/releases/new [new-release]: https://github.com/pelletier/go-toml/releases/new
[gh]: https://github.com/cli/cli [gh]: https://github.com/cli/cli
[pr-labels]: https://git.ostiwe.com/ostiwe/go-toml/blob/v2/.github/release.yml [pr-labels]: https://github.com/pelletier/go-toml/blob/v2/.github/release.yml
+22 -22
@@ -4,21 +4,21 @@ Go library for the [TOML](https://toml.io/en/) format.
This library supports [TOML v1.0.0](https://toml.io/en/v1.0.0). This library supports [TOML v1.0.0](https://toml.io/en/v1.0.0).
[🐞 Bug Reports](https://git.ostiwe.com/ostiwe/go-toml/issues) [🐞 Bug Reports](https://github.com/pelletier/go-toml/issues)
[💬 Anything else](https://git.ostiwe.com/ostiwe/go-toml/discussions) [💬 Anything else](https://github.com/pelletier/go-toml/discussions)
## Documentation ## Documentation
Full API, examples, and implementation notes are available in the Go Full API, examples, and implementation notes are available in the Go
documentation. documentation.
[![Go Reference](https://pkg.go.dev/badge/git.ostiwe.com/ostiwe/go-toml/v2.svg)](https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2) [![Go Reference](https://pkg.go.dev/badge/github.com/pelletier/go-toml/v2.svg)](https://pkg.go.dev/github.com/pelletier/go-toml/v2)
## Import ## Import
```go ```go
import "git.ostiwe.com/ostiwe/go-toml/v2" import "github.com/pelletier/go-toml/v2"
``` ```
See [Modules](#Modules). See [Modules](#Modules).
@@ -41,7 +41,7 @@ operations should not be shockingly slow. See [benchmarks](#benchmarks).
the TOML document was not present in the target structure. This is a great way the TOML document was not present in the target structure. This is a great way
to check for typos. [See example in the documentation][strict]. to check for typos. [See example in the documentation][strict].
[strict]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#example-Decoder.DisallowUnknownFields [strict]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#example-Decoder.DisallowUnknownFields
### Contextualized errors ### Contextualized errors
@@ -56,7 +56,7 @@ example:
3| port = 50 3| port = 50
``` ```
[decode-err]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#DecodeError [decode-err]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#DecodeError
### Local date and time support ### Local date and time support
@@ -68,9 +68,9 @@ making them convenient yet unambiguous structures for their respective TOML
representation. representation.
[ldt]: https://toml.io/en/v1.0.0#local-date-time [ldt]: https://toml.io/en/v1.0.0#local-date-time
[tld]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#LocalDate [tld]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalDate
[tlt]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#LocalTime [tlt]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalTime
[tldt]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#LocalDateTime [tldt]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#LocalDateTime
### Commented config ### Commented config
@@ -90,7 +90,7 @@ port = 4242
# version = 'TLS 1.3' # version = 'TLS 1.3'
``` ```
[comments-example]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#example-Marshal-Commented [comments-example]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#example-Marshal-Commented
## Getting started ## Getting started
@@ -135,7 +135,7 @@ fmt.Println("tags:", cfg.Tags)
// tags: [go toml] // tags: [go toml]
``` ```
[unmarshal]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#Unmarshal [unmarshal]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Unmarshal
Here is an example using tables with some simple nesting: Here is an example using tables with some simple nesting:
@@ -217,7 +217,7 @@ fmt.Println(string(b))
// Tags = ['go', 'toml'] // Tags = ['go', 'toml']
``` ```
[marshal]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#Marshal [marshal]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Marshal
## Unstable API ## Unstable API
@@ -228,7 +228,7 @@ API subject to change.
### Parser ### Parser
Parser is the unstable API that allows iterative parsing of a TOML document at Parser is the unstable API that allows iterative parsing of a TOML document at
the AST level. See https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2/unstable. the AST level. See https://pkg.go.dev/github.com/pelletier/go-toml/v2/unstable.
## Benchmarks ## Benchmarks
@@ -281,7 +281,7 @@ Installation instructions:
- Go ≥ 1.16: Nothing to do. Use the import in your code. The `go` command deals - Go ≥ 1.16: Nothing to do. Use the import in your code. The `go` command deals
with it automatically. with it automatically.
- Go ≥ 1.13: `GO111MODULE=on go get git.ostiwe.com/ostiwe/go-toml/v2`. - Go ≥ 1.13: `GO111MODULE=on go get github.com/pelletier/go-toml/v2`.
In case of trouble: [Go Modules FAQ][mod-faq]. In case of trouble: [Go Modules FAQ][mod-faq].
@@ -294,21 +294,21 @@ Go-toml provides three handy command line tools:
* `tomljson`: Reads a TOML file and outputs its JSON representation. * `tomljson`: Reads a TOML file and outputs its JSON representation.
``` ```
$ go install git.ostiwe.com/ostiwe/go-toml/v2/cmd/tomljson@latest $ go install github.com/pelletier/go-toml/v2/cmd/tomljson@latest
$ tomljson --help $ tomljson --help
``` ```
* `jsontoml`: Reads a JSON file and outputs a TOML representation. * `jsontoml`: Reads a JSON file and outputs a TOML representation.
``` ```
$ go install git.ostiwe.com/ostiwe/go-toml/v2/cmd/jsontoml@latest $ go install github.com/pelletier/go-toml/v2/cmd/jsontoml@latest
$ jsontoml --help $ jsontoml --help
``` ```
* `tomll`: Lints and reformats a TOML file. * `tomll`: Lints and reformats a TOML file.
``` ```
$ go install git.ostiwe.com/ostiwe/go-toml/v2/cmd/tomll@latest $ go install github.com/pelletier/go-toml/v2/cmd/tomll@latest
$ tomll --help $ tomll --help
``` ```
@@ -323,7 +323,7 @@ docker run -i ghcr.io/pelletier/go-toml:v2 tomljson < example.toml
Multiple versions are available on [ghcr.io][docker]. Multiple versions are available on [ghcr.io][docker].
[docker]: https://git.ostiwe.com/ostiwe/go-toml/pkgs/container/go-toml [docker]: https://github.com/pelletier/go-toml/pkgs/container/go-toml
## Migrating from v1 ## Migrating from v1
@@ -344,7 +344,7 @@ This could impact you if you are relying on casing to differentiate two fields,
and one of them is not using the `toml` struct tag. The recommended solution and one of them is not using the `toml` struct tag. The recommended solution
is to be specific about tag names for those fields using the `toml` struct tag. is to be specific about tag names for those fields using the `toml` struct tag.
[v1-keys]: https://git.ostiwe.com/ostiwe/go-toml/blob/a2e52561804c6cd9392ebf0048ca64fe4af67a43/marshal.go#L775-L781 [v1-keys]: https://github.com/pelletier/go-toml/blob/a2e52561804c6cd9392ebf0048ca64fe4af67a43/marshal.go#L775-L781
#### Ignore preexisting value in interface #### Ignore preexisting value in interface
@@ -544,7 +544,7 @@ fmt.Println("v2 Encoder:\n" + string(buf.Bytes()))
// key = 'value' // key = 'value'
``` ```
[sit]: https://pkg.go.dev/git.ostiwe.com/ostiwe/go-toml/v2#Encoder.SetIndentTables [sit]: https://pkg.go.dev/github.com/pelletier/go-toml/v2#Encoder.SetIndentTables
#### Keys and strings are single quoted #### Keys and strings are single quoted
@@ -608,7 +608,7 @@ added to make the encoder behave correctly. Given backward compatibility is not
a problem anymore, v2 does the right thing by default: it follows the behavior a problem anymore, v2 does the right thing by default: it follows the behavior
of `encoding/json`. `Encoder.PromoteAnonymous` has been removed. of `encoding/json`. `Encoder.PromoteAnonymous` has been removed.
[nodoc]: https://git.ostiwe.com/ostiwe/go-toml/discussions/506#discussioncomment-1526038 [nodoc]: https://github.com/pelletier/go-toml/discussions/506#discussioncomment-1526038
### `query` ### `query`
@@ -620,7 +620,7 @@ This package has been removed because it was essentially not supported anymore
(last commit May 2020), increased the complexity of the code base, and more (last commit May 2020), increased the complexity of the code base, and more
complete solutions exist out there. complete solutions exist out there.
[query]: https://git.ostiwe.com/ostiwe/go-toml/tree/f99d6bbca119636aeafcf351ee52b3d202782627/query [query]: https://github.com/pelletier/go-toml/tree/f99d6bbca119636aeafcf351ee52b3d202782627/query
[dasel]: https://github.com/TomWright/dasel [dasel]: https://github.com/TomWright/dasel
## Versioning ## Versioning
+2 -2
@@ -8,8 +8,8 @@ import (
"path/filepath" "path/filepath"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
var benchInputs = []struct { var benchInputs = []struct {
+2 -2
@@ -6,8 +6,8 @@ import (
"testing" "testing"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestUnmarshalSimple(t *testing.T) { func TestUnmarshalSimple(t *testing.T) {
+5 -5
@@ -117,8 +117,8 @@ coverage() {
target_diff="${output_dir}/target.diff.txt" target_diff="${output_dir}/target.diff.txt"
head_diff="${output_dir}/head.diff.txt" head_diff="${output_dir}/head.diff.txt"
cat "${target_out}" | grep -E '^git.ostiwe.com/ostiwe/go-toml' | tr -s "\t " | cut -f 2,3 | sort > "${target_diff}" cat "${target_out}" | grep -E '^github.com/pelletier/go-toml' | tr -s "\t " | cut -f 2,3 | sort > "${target_diff}"
cat "${head_out}" | grep -E '^git.ostiwe.com/ostiwe/go-toml' | tr -s "\t " | cut -f 2,3 | sort > "${head_diff}" cat "${head_out}" | grep -E '^github.com/pelletier/go-toml' | tr -s "\t " | cut -f 2,3 | sort > "${head_diff}"
diff --side-by-side --suppress-common-lines "${target_diff}" "${head_diff}" diff --side-by-side --suppress-common-lines "${target_diff}" "${head_diff}"
return 1 return 1
@@ -147,7 +147,7 @@ bench() {
pushd "$dir" pushd "$dir"
if [ "${replace}" != "" ]; then if [ "${replace}" != "" ]; then
find ./benchmark/ -iname '*.go' -exec sed -i -E "s|git.ostiwe.com/ostiwe/go-toml/v2\"|${replace}\"|g" {} \; find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2\"|${replace}\"|g" {} \;
go get "${replace}" go get "${replace}"
fi fi
@@ -257,9 +257,9 @@ benchmark() {
shift shift
v2stats=`fmktemp go-toml-v2` v2stats=`fmktemp go-toml-v2`
bench HEAD "${v2stats}" "git.ostiwe.com/ostiwe/go-toml/v2" bench HEAD "${v2stats}" "github.com/pelletier/go-toml/v2"
v1stats=`fmktemp go-toml-v1` v1stats=`fmktemp go-toml-v1`
bench HEAD "${v1stats}" "git.ostiwe.com/ostiwe/go-toml" bench HEAD "${v1stats}" "github.com/pelletier/go-toml"
bsstats=`fmktemp bs-toml` bsstats=`fmktemp bs-toml`
bench HEAD "${bsstats}" "github.com/BurntSushi/toml" bench HEAD "${bsstats}" "github.com/BurntSushi/toml"
+1 -1
@@ -7,7 +7,7 @@ import (
"os" "os"
"path" "path"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/testsuite" "github.com/pelletier/go-toml/v2/internal/testsuite"
) )
func main() { func main() {
+1 -1
@@ -7,7 +7,7 @@ import (
"os" "os"
"path" "path"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/testsuite" "github.com/pelletier/go-toml/v2/internal/testsuite"
) )
func main() { func main() {
+3 -3
@@ -14,7 +14,7 @@
// //
// Using Go: // Using Go:
// //
// go install git.ostiwe.com/ostiwe/go-toml/v2/cmd/jsontoml@latest // go install github.com/pelletier/go-toml/v2/cmd/jsontoml@latest
package main package main
import ( import (
@@ -22,8 +22,8 @@ import (
"flag" "flag"
"io" "io"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/cli" "github.com/pelletier/go-toml/v2/internal/cli"
) )
const usage = `jsontoml can be used in two ways: const usage = `jsontoml can be used in two ways:
+1 -1
@@ -5,7 +5,7 @@ import (
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestConvert(t *testing.T) { func TestConvert(t *testing.T) {
+3 -3
@@ -14,7 +14,7 @@
// //
// Using Go: // Using Go:
// //
// go install git.ostiwe.com/ostiwe/go-toml/v2/cmd/tomljson@latest // go install github.com/pelletier/go-toml/v2/cmd/tomljson@latest
package main package main
import ( import (
@@ -23,8 +23,8 @@ import (
"fmt" "fmt"
"io" "io"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/cli" "github.com/pelletier/go-toml/v2/internal/cli"
) )
const usage = `tomljson can be used in two ways: const usage = `tomljson can be used in two ways:
+1 -1
@@ -7,7 +7,7 @@ import (
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestConvert(t *testing.T) { func TestConvert(t *testing.T) {
+3 -3
@@ -14,14 +14,14 @@
// //
// Using Go: // Using Go:
// //
// go install git.ostiwe.com/ostiwe/go-toml/v2/cmd/tomll@latest // go install github.com/pelletier/go-toml/v2/cmd/tomll@latest
package main package main
import ( import (
"io" "io"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/cli" "github.com/pelletier/go-toml/v2/internal/cli"
) )
const usage = `tomll can be used in two ways: const usage = `tomll can be used in two ways:
+1 -1
@@ -5,7 +5,7 @@ import (
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestConvert(t *testing.T) { func TestConvert(t *testing.T) {
+1 -1
@@ -3,7 +3,7 @@
// //
// Within the go-toml package, run `go generate`. Otherwise, use: // Within the go-toml package, run `go generate`. Otherwise, use:
// //
// go run git.ostiwe.com/ostiwe/go-toml/cmd/tomltestgen -o toml_testgen_test.go // go run github.com/pelletier/go-toml/cmd/tomltestgen -o toml_testgen_test.go
package main package main
import ( import (
+79 -94
@@ -6,67 +6,63 @@ import (
"strconv" "strconv"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
func parseInteger(b []byte) (int64, error) { func parseInteger(b []byte, base int) (int64, error) {
if len(b) > 2 && b[0] == '0' { if len(b) > 2 && b[0] == '0' {
switch b[1] { switch b[1] {
case 'x': case 'x':
return parseIntHex(b) return parseIntHex(b, base)
case 'b': case 'b':
return parseIntBin(b) return parseIntBin(b, base)
case 'o': case 'o':
return parseIntOct(b) return parseIntOct(b, base)
default: default:
panic(fmt.Errorf("invalid base '%c', should have been checked by scanIntOrFloat", b[1])) panic(fmt.Errorf("invalid base '%c', should have been checked by scanIntOrFloat", b[1]))
} }
} }
return parseIntDec(b) return parseIntDec(b, base)
} }
func parseLocalDate(b []byte) (LocalDate, error) { func parseLocalDate(b []byte, base int) (LocalDate, error) {
// full-date = date-fullyear "-" date-month "-" date-mday
// date-fullyear = 4DIGIT
// date-month = 2DIGIT ; 01-12
// date-mday = 2DIGIT ; 01-28, 01-29, 01-30, 01-31 based on month/year
var date LocalDate var date LocalDate
if len(b) != 10 || b[4] != '-' || b[7] != '-' { if len(b) != 10 || b[4] != '-' || b[7] != '-' {
return date, unstable.NewParserError(b, "dates are expected to have the format YYYY-MM-DD") return date, unstable.NewParserError(b, base, "dates are expected to have the format YYYY-MM-DD")
} }
var err error var err error
date.Year, err = parseDecimalDigits(b[0:4]) date.Year, err = parseDecimalDigits(b[0:4], base)
if err != nil { if err != nil {
return LocalDate{}, err return LocalDate{}, err
} }
date.Month, err = parseDecimalDigits(b[5:7]) date.Month, err = parseDecimalDigits(b[5:7], base+5)
if err != nil { if err != nil {
return LocalDate{}, err return LocalDate{}, err
} }
date.Day, err = parseDecimalDigits(b[8:10]) date.Day, err = parseDecimalDigits(b[8:10], base+8)
if err != nil { if err != nil {
return LocalDate{}, err return LocalDate{}, err
} }
if !isValidDate(date.Year, date.Month, date.Day) { if !isValidDate(date.Year, date.Month, date.Day) {
return LocalDate{}, unstable.NewParserError(b, "impossible date") return LocalDate{}, unstable.NewParserError(b, base, "impossible date")
} }
return date, nil return date, nil
} }
func parseDecimalDigits(b []byte) (int, error) { func parseDecimalDigits(b []byte, base int) (int, error) {
v := 0 v := 0
for i, c := range b { for i, c := range b {
if c < '0' || c > '9' { if c < '0' || c > '9' {
return 0, unstable.NewParserError(b[i:i+1], "expected digit (0-9)") return 0, unstable.NewParserError(b[i:i+1], base+i, "expected digit (0-9)")
} }
v *= 10 v *= 10
v += int(c - '0') v += int(c - '0')
@@ -75,21 +71,18 @@ func parseDecimalDigits(b []byte) (int, error) {
return v, nil return v, nil
} }
func parseDateTime(b []byte) (time.Time, error) { func parseDateTime(b []byte, base int) (time.Time, error) {
// offset-date-time = full-date time-delim full-time origLen := len(b)
// full-time = partial-time time-offset dt, b, err := parseLocalDateTime(b, base)
// time-offset = "Z" / time-numoffset
// time-numoffset = ( "+" / "-" ) time-hour ":" time-minute
dt, b, err := parseLocalDateTime(b)
if err != nil { if err != nil {
return time.Time{}, err return time.Time{}, err
} }
tzBase := base + origLen - len(b)
var zone *time.Location var zone *time.Location
if len(b) == 0 { if len(b) == 0 {
// parser should have checked that when assigning the date time node
panic("date time should have a timezone") panic("date time should have a timezone")
} }
@@ -99,7 +92,7 @@ func parseDateTime(b []byte) (time.Time, error) {
} else { } else {
const dateTimeByteLen = 6 const dateTimeByteLen = 6
if len(b) != dateTimeByteLen { if len(b) != dateTimeByteLen {
return time.Time{}, unstable.NewParserError(b, "invalid date-time timezone") return time.Time{}, unstable.NewParserError(b, tzBase, "invalid date-time timezone")
} }
var direction int var direction int
switch b[0] { switch b[0] {
@@ -108,27 +101,27 @@ func parseDateTime(b []byte) (time.Time, error) {
case '+': case '+':
direction = +1 direction = +1
default: default:
return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset character") return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset character")
} }
if b[3] != ':' { if b[3] != ':' {
return time.Time{}, unstable.NewParserError(b[3:4], "expected a : separator") return time.Time{}, unstable.NewParserError(b[3:4], tzBase+3, "expected a : separator")
} }
hours, err := parseDecimalDigits(b[1:3]) hours, err := parseDecimalDigits(b[1:3], tzBase+1)
if err != nil { if err != nil {
return time.Time{}, err return time.Time{}, err
} }
if hours > 23 { if hours > 23 {
return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset hours") return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset hours")
} }
minutes, err := parseDecimalDigits(b[4:6]) minutes, err := parseDecimalDigits(b[4:6], tzBase+4)
if err != nil { if err != nil {
return time.Time{}, err return time.Time{}, err
} }
if minutes > 59 { if minutes > 59 {
return time.Time{}, unstable.NewParserError(b[:1], "invalid timezone offset minutes") return time.Time{}, unstable.NewParserError(b[:1], tzBase, "invalid timezone offset minutes")
} }
seconds := direction * (hours*3600 + minutes*60) seconds := direction * (hours*3600 + minutes*60)
@@ -141,7 +134,7 @@ func parseDateTime(b []byte) (time.Time, error) {
} }
if len(b) > 0 { if len(b) > 0 {
return time.Time{}, unstable.NewParserError(b, "extra bytes at the end of the timezone") return time.Time{}, unstable.NewParserError(b, tzBase, "extra bytes at the end of the timezone")
} }
t := time.Date( t := time.Date(
@@ -157,15 +150,15 @@ func parseDateTime(b []byte) (time.Time, error) {
return t, nil return t, nil
} }
func parseLocalDateTime(b []byte) (LocalDateTime, []byte, error) { func parseLocalDateTime(b []byte, base int) (LocalDateTime, []byte, error) {
var dt LocalDateTime var dt LocalDateTime
const localDateTimeByteMinLen = 11 const localDateTimeByteMinLen = 11
if len(b) < localDateTimeByteMinLen { if len(b) < localDateTimeByteMinLen {
return dt, nil, unstable.NewParserError(b, "local datetimes are expected to have the format YYYY-MM-DDTHH:MM:SS[.NNNNNNNNN]") return dt, nil, unstable.NewParserError(b, base, "local datetimes are expected to have the format YYYY-MM-DDTHH:MM:SS[.NNNNNNNNN]")
} }
date, err := parseLocalDate(b[:10]) date, err := parseLocalDate(b[:10], base)
if err != nil { if err != nil {
return dt, nil, err return dt, nil, err
} }
@@ -173,10 +166,10 @@ func parseLocalDateTime(b []byte) (LocalDateTime, []byte, error) {
sep := b[10] sep := b[10]
if sep != 'T' && sep != ' ' && sep != 't' { if sep != 'T' && sep != ' ' && sep != 't' {
return dt, nil, unstable.NewParserError(b[10:11], "datetime separator is expected to be T or a space") return dt, nil, unstable.NewParserError(b[10:11], base+10, "datetime separator is expected to be T or a space")
} }
t, rest, err := parseLocalTime(b[11:]) t, rest, err := parseLocalTime(b[11:], base+11)
if err != nil { if err != nil {
return dt, nil, err return dt, nil, err
} }
@@ -188,53 +181,53 @@ func parseLocalDateTime(b []byte) (LocalDateTime, []byte, error) {
// parseLocalTime is a bit different because it also returns the remaining // parseLocalTime is a bit different because it also returns the remaining
// []byte that it didn't need. This is to allow parseDateTime to parse those // []byte that it didn't need. This is to allow parseDateTime to parse those
// remaining bytes as a timezone. // remaining bytes as a timezone.
func parseLocalTime(b []byte) (LocalTime, []byte, error) { func parseLocalTime(b []byte, base int) (LocalTime, []byte, error) {
var ( var (
nspow = [10]int{0, 1e8, 1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1, 1e0} nspow = [10]int{0, 1e8, 1e7, 1e6, 1e5, 1e4, 1e3, 1e2, 1e1, 1e0}
t LocalTime t LocalTime
) )
// check if b matches to have expected format HH:MM:SS[.NNNNNN]
const localTimeByteLen = 8 const localTimeByteLen = 8
if len(b) < localTimeByteLen { if len(b) < localTimeByteLen {
return t, nil, unstable.NewParserError(b, "times are expected to have the format HH:MM:SS[.NNNNNN]") return t, nil, unstable.NewParserError(b, base, "times are expected to have the format HH:MM:SS[.NNNNNN]")
} }
var err error var err error
t.Hour, err = parseDecimalDigits(b[0:2]) t.Hour, err = parseDecimalDigits(b[0:2], base)
if err != nil { if err != nil {
return t, nil, err return t, nil, err
} }
if t.Hour > 23 { if t.Hour > 23 {
return t, nil, unstable.NewParserError(b[0:2], "hour cannot be greater 23") return t, nil, unstable.NewParserError(b[0:2], base, "hour cannot be greater 23")
} }
if b[2] != ':' { if b[2] != ':' {
return t, nil, unstable.NewParserError(b[2:3], "expecting colon between hours and minutes") return t, nil, unstable.NewParserError(b[2:3], base+2, "expecting colon between hours and minutes")
} }
t.Minute, err = parseDecimalDigits(b[3:5]) t.Minute, err = parseDecimalDigits(b[3:5], base+3)
if err != nil { if err != nil {
return t, nil, err return t, nil, err
} }
if t.Minute > 59 { if t.Minute > 59 {
return t, nil, unstable.NewParserError(b[3:5], "minutes cannot be greater 59") return t, nil, unstable.NewParserError(b[3:5], base+3, "minutes cannot be greater 59")
} }
if b[5] != ':' { if b[5] != ':' {
return t, nil, unstable.NewParserError(b[5:6], "expecting colon between minutes and seconds") return t, nil, unstable.NewParserError(b[5:6], base+5, "expecting colon between minutes and seconds")
} }
t.Second, err = parseDecimalDigits(b[6:8]) t.Second, err = parseDecimalDigits(b[6:8], base+6)
if err != nil { if err != nil {
return t, nil, err return t, nil, err
} }
if t.Second > 59 { if t.Second > 59 {
return t, nil, unstable.NewParserError(b[6:8], "seconds cannot be greater than 59") return t, nil, unstable.NewParserError(b[6:8], base+6, "seconds cannot be greater than 59")
} }
b = b[8:] b = b[8:]
base += 8
if len(b) >= 1 && b[0] == '.' { if len(b) >= 1 && b[0] == '.' {
frac := 0 frac := 0
@@ -244,7 +237,7 @@ func parseLocalTime(b []byte) (LocalTime, []byte, error) {
for i, c := range b[1:] { for i, c := range b[1:] {
if !isDigit(c) { if !isDigit(c) {
if i == 0 { if i == 0 {
return t, nil, unstable.NewParserError(b[0:1], "need at least one digit after fraction point") return t, nil, unstable.NewParserError(b[0:1], base, "need at least one digit after fraction point")
} }
break break
} }
@@ -252,13 +245,6 @@ func parseLocalTime(b []byte) (LocalTime, []byte, error) {
const maxFracPrecision = 9 const maxFracPrecision = 9
if i >= maxFracPrecision { if i >= maxFracPrecision {
// go-toml allows decoding fractional seconds
// beyond the supported precision of 9
// digits. It truncates the fractional component
// to the supported precision and ignores the
// remaining digits.
//
// https://git.ostiwe.com/ostiwe/go-toml/discussions/707
continue continue
} }
@@ -268,7 +254,7 @@ func parseLocalTime(b []byte) (LocalTime, []byte, error) {
} }
if precision == 0 { if precision == 0 {
return t, nil, unstable.NewParserError(b[:1], "nanoseconds need at least one digit") return t, nil, unstable.NewParserError(b[:1], base, "nanoseconds need at least one digit")
} }
t.Nanosecond = frac * nspow[precision] t.Nanosecond = frac * nspow[precision]
@@ -279,35 +265,35 @@ func parseLocalTime(b []byte) (LocalTime, []byte, error) {
return t, b, nil return t, b, nil
} }
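The base parameter threaded through these parsers appears to be the absolute byte offset of b[0] in the document, so an error found at local index i is reported at base+i, and each sub-parser receives base plus the start of the subslice it is handed. A minimal sketch of that convention follows. The names posError, parseTwoDigits and parseHourMinute are hypothetical and are not part of go-toml.

```go
package main

import "fmt"

// posError is a hypothetical stand-in for a parser error that carries its
// own absolute byte offset.
type posError struct {
	Offset int
	Msg    string
}

func (e *posError) Error() string {
	return fmt.Sprintf("offset %d: %s", e.Offset, e.Msg)
}

// parseTwoDigits parses exactly two ASCII digits. base is the absolute
// offset of b[0] in the original document.
func parseTwoDigits(b []byte, base int) (int, error) {
	if len(b) != 2 {
		return 0, &posError{Offset: base, Msg: "expected two digits"}
	}
	n := 0
	for i, c := range b {
		if c < '0' || c > '9' {
			// Local index i becomes absolute offset base+i.
			return 0, &posError{Offset: base + i, Msg: "expected digit"}
		}
		n = n*10 + int(c-'0')
	}
	return n, nil
}

// parseHourMinute parses "HH:MM" and passes base+k to whatever consumes b[k:].
func parseHourMinute(b []byte, base int) (hour, minute int, err error) {
	if len(b) < 5 {
		return 0, 0, &posError{Offset: base, Msg: "expected HH:MM"}
	}
	if hour, err = parseTwoDigits(b[0:2], base); err != nil {
		return 0, 0, err
	}
	if b[2] != ':' {
		return 0, 0, &posError{Offset: base + 2, Msg: "expected colon"}
	}
	if minute, err = parseTwoDigits(b[3:5], base+3); err != nil {
		return 0, 0, err
	}
	return hour, minute, nil
}

func main() {
	doc := []byte("t = 12:3x")
	// The value starts at byte 4 of the document.
	_, _, err := parseHourMinute(doc[4:], 4)
	fmt.Println(err) // prints "offset 8: expected digit"
}
```

The caller never needs the original document slice to locate the error; the offset travels with the error value.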
func parseFloat(b []byte) (float64, error) { func parseFloat(b []byte, base int) (float64, error) {
if len(b) == 4 && (b[0] == '+' || b[0] == '-') && b[1] == 'n' && b[2] == 'a' && b[3] == 'n' { if len(b) == 4 && (b[0] == '+' || b[0] == '-') && b[1] == 'n' && b[2] == 'a' && b[3] == 'n' {
return math.NaN(), nil return math.NaN(), nil
} }
cleaned, err := checkAndRemoveUnderscoresFloats(b) cleaned, err := checkAndRemoveUnderscoresFloats(b, base)
if err != nil { if err != nil {
return 0, err return 0, err
} }
if cleaned[0] == '.' { if cleaned[0] == '.' {
return 0, unstable.NewParserError(b, "float cannot start with a dot") return 0, unstable.NewParserError(b, base, "float cannot start with a dot")
} }
if cleaned[len(cleaned)-1] == '.' { if cleaned[len(cleaned)-1] == '.' {
return 0, unstable.NewParserError(b, "float cannot end with a dot") return 0, unstable.NewParserError(b, base, "float cannot end with a dot")
} }
dotAlreadySeen := false dotAlreadySeen := false
for i, c := range cleaned { for i, c := range cleaned {
if c == '.' { if c == '.' {
if dotAlreadySeen { if dotAlreadySeen {
return 0, unstable.NewParserError(b[i:i+1], "float can have at most one decimal point") return 0, unstable.NewParserError(b[i:i+1], base+i, "float can have at most one decimal point")
} }
if !isDigit(cleaned[i-1]) { if !isDigit(cleaned[i-1]) {
return 0, unstable.NewParserError(b[i-1:i+1], "float decimal point must be preceded by a digit") return 0, unstable.NewParserError(b[i-1:i+1], base+i-1, "float decimal point must be preceded by a digit")
} }
if !isDigit(cleaned[i+1]) { if !isDigit(cleaned[i+1]) {
return 0, unstable.NewParserError(b[i:i+2], "float decimal point must be followed by a digit") return 0, unstable.NewParserError(b[i:i+2], base+i, "float decimal point must be followed by a digit")
} }
dotAlreadySeen = true dotAlreadySeen = true
} }
@@ -318,54 +304,54 @@ func parseFloat(b []byte) (float64, error) {
start = 1 start = 1
} }
if cleaned[start] == '0' && len(cleaned) > start+1 && isDigit(cleaned[start+1]) { if cleaned[start] == '0' && len(cleaned) > start+1 && isDigit(cleaned[start+1]) {
return 0, unstable.NewParserError(b, "float integer part cannot have leading zeroes") return 0, unstable.NewParserError(b, base, "float integer part cannot have leading zeroes")
} }
f, err := strconv.ParseFloat(string(cleaned), 64) f, err := strconv.ParseFloat(string(cleaned), 64)
if err != nil { if err != nil {
return 0, unstable.NewParserError(b, "unable to parse float: %w", err) return 0, unstable.NewParserError(b, base, "unable to parse float: %w", err)
} }
return f, nil return f, nil
} }
func parseIntHex(b []byte) (int64, error) { func parseIntHex(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:]) cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
if err != nil { if err != nil {
return 0, err return 0, err
} }
i, err := strconv.ParseInt(string(cleaned), 16, 64) i, err := strconv.ParseInt(string(cleaned), 16, 64)
if err != nil { if err != nil {
return 0, unstable.NewParserError(b, "couldn't parse hexadecimal number: %w", err) return 0, unstable.NewParserError(b, base, "couldn't parse hexadecimal number: %w", err)
} }
return i, nil return i, nil
} }
func parseIntOct(b []byte) (int64, error) { func parseIntOct(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:]) cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
if err != nil { if err != nil {
return 0, err return 0, err
} }
i, err := strconv.ParseInt(string(cleaned), 8, 64) i, err := strconv.ParseInt(string(cleaned), 8, 64)
if err != nil { if err != nil {
return 0, unstable.NewParserError(b, "couldn't parse octal number: %w", err) return 0, unstable.NewParserError(b, base, "couldn't parse octal number: %w", err)
} }
return i, nil return i, nil
} }
func parseIntBin(b []byte) (int64, error) { func parseIntBin(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:]) cleaned, err := checkAndRemoveUnderscoresIntegers(b[2:], base+2)
if err != nil { if err != nil {
return 0, err return 0, err
} }
i, err := strconv.ParseInt(string(cleaned), 2, 64) i, err := strconv.ParseInt(string(cleaned), 2, 64)
if err != nil { if err != nil {
return 0, unstable.NewParserError(b, "couldn't parse binary number: %w", err) return 0, unstable.NewParserError(b, base, "couldn't parse binary number: %w", err)
} }
return i, nil return i, nil
@@ -375,8 +361,8 @@ func isSign(b byte) bool {
return b == '+' || b == '-' return b == '+' || b == '-'
} }
func parseIntDec(b []byte) (int64, error) { func parseIntDec(b []byte, base int) (int64, error) {
cleaned, err := checkAndRemoveUnderscoresIntegers(b) cleaned, err := checkAndRemoveUnderscoresIntegers(b, base)
if err != nil { if err != nil {
return 0, err return 0, err
} }
@@ -388,18 +374,18 @@ func parseIntDec(b []byte) (int64, error) {
} }
if len(cleaned) > startIdx+1 && cleaned[startIdx] == '0' { if len(cleaned) > startIdx+1 && cleaned[startIdx] == '0' {
return 0, unstable.NewParserError(b, "leading zero not allowed on decimal number") return 0, unstable.NewParserError(b, base, "leading zero not allowed on decimal number")
} }
i, err := strconv.ParseInt(string(cleaned), 10, 64) i, err := strconv.ParseInt(string(cleaned), 10, 64)
if err != nil { if err != nil {
return 0, unstable.NewParserError(b, "couldn't parse decimal number: %w", err) return 0, unstable.NewParserError(b, base, "couldn't parse decimal number: %w", err)
} }
return i, nil return i, nil
} }
func checkAndRemoveUnderscoresIntegers(b []byte) ([]byte, error) { func checkAndRemoveUnderscoresIntegers(b []byte, base int) ([]byte, error) {
start := 0 start := 0
if b[start] == '+' || b[start] == '-' { if b[start] == '+' || b[start] == '-' {
start++ start++
@@ -410,11 +396,11 @@ func checkAndRemoveUnderscoresIntegers(b []byte) ([]byte, error) {
} }
if b[start] == '_' { if b[start] == '_' {
return nil, unstable.NewParserError(b[start:start+1], "number cannot start with underscore") return nil, unstable.NewParserError(b[start:start+1], base+start, "number cannot start with underscore")
} }
if b[len(b)-1] == '_' { if b[len(b)-1] == '_' {
return nil, unstable.NewParserError(b[len(b)-1:], "number cannot end with underscore") return nil, unstable.NewParserError(b[len(b)-1:], base+len(b)-1, "number cannot end with underscore")
} }
// fast path // fast path
@@ -436,7 +422,7 @@ func checkAndRemoveUnderscoresIntegers(b []byte) ([]byte, error) {
c := b[i] c := b[i]
if c == '_' { if c == '_' {
if !before { if !before {
return nil, unstable.NewParserError(b[i-1:i+1], "number must have at least one digit between underscores") return nil, unstable.NewParserError(b[i-1:i+1], base+i-1, "number must have at least one digit between underscores")
} }
before = false before = false
} else { } else {
@@ -448,13 +434,13 @@ func checkAndRemoveUnderscoresIntegers(b []byte) ([]byte, error) {
return cleaned, nil return cleaned, nil
} }
func checkAndRemoveUnderscoresFloats(b []byte) ([]byte, error) { func checkAndRemoveUnderscoresFloats(b []byte, base int) ([]byte, error) {
if b[0] == '_' { if b[0] == '_' {
return nil, unstable.NewParserError(b[0:1], "number cannot start with underscore") return nil, unstable.NewParserError(b[0:1], base, "number cannot start with underscore")
} }
if b[len(b)-1] == '_' { if b[len(b)-1] == '_' {
return nil, unstable.NewParserError(b[len(b)-1:], "number cannot end with underscore") return nil, unstable.NewParserError(b[len(b)-1:], base+len(b)-1, "number cannot end with underscore")
} }
// fast path // fast path
@@ -477,27 +463,26 @@ func checkAndRemoveUnderscoresFloats(b []byte) ([]byte, error) {
switch c { switch c {
case '_': case '_':
if !before { if !before {
return nil, unstable.NewParserError(b[i-1:i+1], "number must have at least one digit between underscores") return nil, unstable.NewParserError(b[i-1:i+1], base+i-1, "number must have at least one digit between underscores")
} }
if i < len(b)-1 && (b[i+1] == 'e' || b[i+1] == 'E') { if i < len(b)-1 && (b[i+1] == 'e' || b[i+1] == 'E') {
return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore before exponent") return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore before exponent")
} }
before = false before = false
case '+', '-': case '+', '-':
// signed exponents
cleaned = append(cleaned, c) cleaned = append(cleaned, c)
before = false before = false
case 'e', 'E': case 'e', 'E':
if i < len(b)-1 && b[i+1] == '_' { if i < len(b)-1 && b[i+1] == '_' {
return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore after exponent") return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore after exponent")
} }
cleaned = append(cleaned, c) cleaned = append(cleaned, c)
case '.': case '.':
if i < len(b)-1 && b[i+1] == '_' { if i < len(b)-1 && b[i+1] == '_' {
return nil, unstable.NewParserError(b[i+1:i+2], "cannot have underscore after decimal point") return nil, unstable.NewParserError(b[i+1:i+2], base+i+1, "cannot have underscore after decimal point")
} }
if i > 0 && b[i-1] == '_' { if i > 0 && b[i-1] == '_' {
return nil, unstable.NewParserError(b[i-1:i], "cannot have underscore before decimal point") return nil, unstable.NewParserError(b[i-1:i], base+i-1, "cannot have underscore before decimal point")
} }
cleaned = append(cleaned, c) cleaned = append(cleaned, c)
default: default:
+2 -22
@@ -2,11 +2,10 @@ package toml
import ( import (
"fmt" "fmt"
"reflect"
"strconv" "strconv"
"strings" "strings"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
// DecodeError represents an error encountered during the parsing or decoding // DecodeError represents an error encountered during the parsing or decoding
@@ -100,7 +99,7 @@ func (e *DecodeError) Key() Key {
// //
//nolint:funlen //nolint:funlen
func wrapDecodeError(document []byte, de *unstable.ParserError) *DecodeError { func wrapDecodeError(document []byte, de *unstable.ParserError) *DecodeError {
offset := subsliceOffset(document, de.Highlight) offset := de.Offset
errMessage := de.Error() errMessage := de.Error()
errLine, errColumn := positionAtEnd(document[:offset]) errLine, errColumn := positionAtEnd(document[:offset])
@@ -262,22 +261,3 @@ func positionAtEnd(b []byte) (row int, column int) {
return row, column return row, column
} }
// subsliceOffset returns the byte offset of subslice within data.
// subslice must share the same backing array as data.
func subsliceOffset(data []byte, subslice []byte) int {
if len(subslice) == 0 {
return 0
}
// Use reflect to get the data pointers of both slices.
// This is safe because we're only reading the pointer values for comparison.
dataPtr := reflect.ValueOf(data).Pointer()
subPtr := reflect.ValueOf(subslice).Pointer()
offset := int(subPtr - dataPtr)
if offset < 0 || offset > len(data) {
panic("subslice is not within data")
}
return offset
}
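With the pointer-based helper gone, wrapDecodeError reads de.Offset and derives the line and column from document[:offset]. The body of positionAtEnd is not shown in this diff, so the following is only a rough sketch of how a byte offset can be turned into a 1-based row and column. lineCol is a hypothetical helper, and it counts columns in bytes, which may differ from what the library does.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineCol is a hypothetical helper returning the 1-based row and column
// (in bytes) of the byte located at offset in doc.
func lineCol(doc []byte, offset int) (row, col int) {
	prefix := doc[:offset]
	// Every newline before the offset moves the position down one row.
	row = bytes.Count(prefix, []byte{'\n'}) + 1
	// The current line starts right after the last newline, or at 0.
	lineStart := bytes.LastIndexByte(prefix, '\n') + 1
	col = offset - lineStart + 1
	return row, col
}

func main() {
	doc := []byte("# comment\n= \"value\"")
	row, col := lineCol(doc, 10) // byte 10 is the '=' on the second line
	fmt.Println(row, col)        // prints "2 1"
}
```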
+97 -2
@@ -7,8 +7,8 @@ import (
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
//nolint:funlen //nolint:funlen
@@ -171,6 +171,7 @@ line 5`,
err := wrapDecodeError(doc, &unstable.ParserError{ err := wrapDecodeError(doc, &unstable.ParserError{
Highlight: hl, Highlight: hl,
Offset: start,
Message: e.msg, Message: e.msg,
}) })
@@ -286,6 +287,100 @@ func TestDecodeError_Position(t *testing.T) {
} }
} }
func TestDecodeError_PositionAfterComments(t *testing.T) {
examples := []struct {
name string
doc string
expectedRow int
expectedCol int
errContains string
}{
{
name: "invalid key after comment",
doc: "# comment\n= \"value\"",
expectedRow: 2,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key after multiple comments",
doc: "# line 1\n# line 2\n= \"value\"",
expectedRow: 3,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key after valid assignment and comment",
doc: "a = 1\n# comment\n= \"value\"",
expectedRow: 3,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key on first line",
doc: "= \"value\"",
expectedRow: 1,
expectedCol: 1,
errContains: "invalid character at start of key",
},
{
name: "invalid key with leading whitespace",
doc: "# comment\n  = \"value\"",
expectedRow: 2,
expectedCol: 3,
errContains: "invalid character at start of key",
},
}
for _, e := range examples {
t.Run(e.name, func(t *testing.T) {
var v map[string]interface{}
err := Unmarshal([]byte(e.doc), &v)
if err == nil {
t.Fatal("expected an error")
}
var derr *DecodeError
if !errors.As(err, &derr) {
t.Fatalf("expected DecodeError, got %T: %v", err, err)
}
row, col := derr.Position()
if row != e.expectedRow {
t.Errorf("row: got %d, want %d (error: %s)", row, e.expectedRow, derr.String())
}
if col != e.expectedCol {
t.Errorf("col: got %d, want %d (error: %s)", col, e.expectedCol, derr.String())
}
if !strings.Contains(derr.Error(), e.errContains) {
t.Errorf("error %q does not contain %q", derr.Error(), e.errContains)
}
})
}
}
func TestDecodeError_HumanStringAfterComments(t *testing.T) {
doc := "# comment\n= \"value\""
var v map[string]interface{}
err := Unmarshal([]byte(doc), &v)
if err == nil {
t.Fatal("expected an error")
}
var derr *DecodeError
if !errors.As(err, &derr) {
t.Fatalf("expected DecodeError, got %T: %v", err, err)
}
human := derr.String()
if !strings.Contains(human, "= \"value\"") {
t.Errorf("human-readable error should show the offending line, got:\n%s", human)
}
if !strings.Contains(human, "2|") {
t.Errorf("human-readable error should reference line 2, got:\n%s", human)
}
}
func TestStrictErrorUnwrap(t *testing.T) { func TestStrictErrorUnwrap(t *testing.T) {
fo := bytes.NewBufferString(` fo := bytes.NewBufferString(`
Missing = 1 Missing = 1
+1 -1
@@ -5,7 +5,7 @@ import (
"log" "log"
"strconv" "strconv"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
) )
type customInt int type customInt int
+2 -2
@@ -3,8 +3,8 @@ package toml_test
import ( import (
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestFastSimpleInt(t *testing.T) { func TestFastSimpleInt(t *testing.T) {
+2 -2
@@ -5,8 +5,8 @@ import (
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func FuzzUnmarshal(f *testing.F) { func FuzzUnmarshal(f *testing.F) {
+1 -1
@@ -1,3 +1,3 @@
module git.ostiwe.com/ostiwe/go-toml/v2 module github.com/pelletier/go-toml/v2
go 1.21.0 go 1.21.0
+12 -16
@@ -24,61 +24,57 @@ import (
// 0x9 => tab, ok // 0x9 => tab, ok
// 0xA - 0x1F => invalid // 0xA - 0x1F => invalid
// 0x7F => invalid // 0x7F => invalid
func Utf8TomlValidAlreadyEscaped(p []byte) []byte { func Utf8TomlValidAlreadyEscaped(p []byte) int {
consumed := 0
// Fast path. Check for and skip 8 bytes of ASCII characters per iteration. // Fast path. Check for and skip 8 bytes of ASCII characters per iteration.
for len(p) >= 8 { for len(p) >= 8 {
// Combining two 32 bit loads allows the same code to be used
// for 32 and 64 bit platforms.
// The compiler can generate a 32bit load for first32 and second32
// on many platforms. See test/codegen/memcombine.go.
first32 := uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16 | uint32(p[3])<<24 first32 := uint32(p[0]) | uint32(p[1])<<8 | uint32(p[2])<<16 | uint32(p[3])<<24
second32 := uint32(p[4]) | uint32(p[5])<<8 | uint32(p[6])<<16 | uint32(p[7])<<24 second32 := uint32(p[4]) | uint32(p[5])<<8 | uint32(p[6])<<16 | uint32(p[7])<<24
if (first32|second32)&0x80808080 != 0 { if (first32|second32)&0x80808080 != 0 {
// Found a non ASCII byte (>= RuneSelf).
break break
} }
for i, b := range p[:8] { for i, b := range p[:8] {
if InvalidASCII(b) { if InvalidASCII(b) {
return p[i : i+1] return consumed + i
} }
} }
p = p[8:] p = p[8:]
consumed += 8
} }
n := len(p) n := len(p)
for i := 0; i < n; { for i := 0; i < n; {
pi := p[i] pi := p[i]
if pi < utf8.RuneSelf { if pi < utf8.RuneSelf {
if InvalidASCII(pi) { if InvalidASCII(pi) {
return p[i : i+1] return consumed + i
} }
i++ i++
continue continue
} }
x := first[pi] x := first[pi]
if x == xx { if x == xx {
// Illegal starter byte. return consumed + i
return p[i : i+1]
} }
size := int(x & 7) size := int(x & 7)
if i+size > n { if i+size > n {
// Short or invalid. return consumed + i
return p[i:n]
} }
accept := acceptRanges[x>>4] accept := acceptRanges[x>>4]
if c := p[i+1]; c < accept.lo || accept.hi < c { if c := p[i+1]; c < accept.lo || accept.hi < c {
return p[i : i+2] return consumed + i
} else if size == 2 { //revive:disable:empty-block } else if size == 2 { //revive:disable:empty-block
} else if c := p[i+2]; c < locb || hicb < c { } else if c := p[i+2]; c < locb || hicb < c {
return p[i : i+3] return consumed + i
} else if size == 3 { //revive:disable:empty-block } else if size == 3 { //revive:disable:empty-block
} else if c := p[i+3]; c < locb || hicb < c { } else if c := p[i+3]; c < locb || hicb < c {
return p[i : i+4] return consumed + i
} }
i += size i += size
} }
return nil return -1
} }
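Returning an index (with -1 meaning "valid") instead of a subslice lets the caller compute an absolute offset by plain addition. The sketch below illustrates that contract only. firstInvalid is a hypothetical, ASCII-only simplification and does not reproduce the UTF-8 checks of Utf8TomlValidAlreadyEscaped; checkSpan is likewise an assumed caller, not library code.

```go
package main

import "fmt"

// firstInvalid is a hypothetical, ASCII-only simplification: it returns the
// index of the first control character other than tab, or -1 if none.
func firstInvalid(p []byte) int {
	for i, b := range p {
		if (b < 0x20 && b != '\t') || b == 0x7F {
			return i
		}
	}
	return -1
}

// checkSpan validates doc[start:end] and, on failure, converts the local
// index into an absolute document offset.
func checkSpan(doc []byte, start, end int) (offset int, ok bool) {
	idx := firstInvalid(doc[start:end])
	if idx < 0 {
		return 0, true
	}
	return start + idx, false
}

func main() {
	doc := []byte("k = \"a\x01b\"")
	// The string contents occupy bytes 5 through 7.
	offset, ok := checkSpan(doc, 5, 8)
	fmt.Println(offset, ok) // prints "6 false"
}
```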
// Utf8ValidNext returns the size of the next rune if valid, 0 otherwise. // Utf8ValidNext returns the size of the next rune if valid, 0 otherwise.
+1 -1
@@ -9,7 +9,7 @@ import (
"io" "io"
"os" "os"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
) )
type ConvertFn func(r io.Reader, w io.Writer) error type ConvertFn func(r io.Reader, w io.Writer) error
+2 -2
@@ -9,8 +9,8 @@ import (
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func processMain(args []string, input io.Reader, stdout, stderr io.Writer, f ConvertFn) int { func processMain(args []string, input io.Reader, stdout, stderr io.Writer, f ConvertFn) int {
@@ -8,8 +8,8 @@ import (
"testing" "testing"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestDocMarshal(t *testing.T) { func TestDocMarshal(t *testing.T) {
@@ -15,8 +15,8 @@ import (
"testing" "testing"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
type basicMarshalTestStruct struct { type basicMarshalTestStruct struct {
+1 -1
@@ -6,7 +6,7 @@ import (
"strconv" "strconv"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
) )
// addTag adds JSON tags to a data structure as expected by toml-test. // addTag adds JSON tags to a data structure as expected by toml-test.
+1 -1
@@ -5,7 +5,7 @@ import (
"strconv" "strconv"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
) )
// Remove JSON tags from a data structure as returned by toml-test. // Remove JSON tags from a data structure as returned by toml-test.
+1 -1
@@ -7,7 +7,7 @@ import (
"fmt" "fmt"
"os" "os"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
) )
// Marshal is a helper function for calling toml.Marshal // Marshal is a helper function for calling toml.Marshal
+1 -1
@@ -1,6 +1,6 @@
package tracker package tracker
import "git.ostiwe.com/ostiwe/go-toml/v2/unstable" import "github.com/pelletier/go-toml/v2/unstable"
// KeyTracker is a tracker that keeps track of the current Key as the AST is // KeyTracker is a tracker that keeps track of the current Key as the AST is
// walked. // walked.
+1 -1
@@ -5,7 +5,7 @@ import (
"fmt" "fmt"
"sync" "sync"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
type keyKind uint8 type keyKind uint8
+1 -1
@@ -4,7 +4,7 @@ import (
"reflect" "reflect"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestEntrySize(t *testing.T) { func TestEntrySize(t *testing.T) {
+6 -6
@@ -5,7 +5,7 @@ import (
"strings" "strings"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
// LocalDate represents a calendar day in no specific timezone. // LocalDate represents a calendar day in no specific timezone.
@@ -32,7 +32,7 @@ func (d LocalDate) MarshalText() ([]byte, error) {
// UnmarshalText parses b using RFC 3339 to fill d. // UnmarshalText parses b using RFC 3339 to fill d.
func (d *LocalDate) UnmarshalText(b []byte) error { func (d *LocalDate) UnmarshalText(b []byte) error {
res, err := parseLocalDate(b) res, err := parseLocalDate(b, 0)
if err != nil { if err != nil {
return err return err
} }
@@ -75,9 +75,9 @@ func (d LocalTime) MarshalText() ([]byte, error) {
// UnmarshalText parses b using RFC 3339 to fill d. // UnmarshalText parses b using RFC 3339 to fill d.
func (d *LocalTime) UnmarshalText(b []byte) error { func (d *LocalTime) UnmarshalText(b []byte) error {
res, left, err := parseLocalTime(b) res, left, err := parseLocalTime(b, 0)
if err == nil && len(left) != 0 { if err == nil && len(left) != 0 {
err = unstable.NewParserError(left, "extra characters") err = unstable.NewParserError(left, len(b)-len(left), "extra characters")
} }
if err != nil { if err != nil {
return err return err
@@ -109,9 +109,9 @@ func (d LocalDateTime) MarshalText() ([]byte, error) {
// UnmarshalText parses b using RFC 3339 to fill d. // UnmarshalText parses b using RFC 3339 to fill d.
func (d *LocalDateTime) UnmarshalText(data []byte) error { func (d *LocalDateTime) UnmarshalText(data []byte) error {
res, left, err := parseLocalDateTime(data) res, left, err := parseLocalDateTime(data, 0)
if err == nil && len(left) != 0 { if err == nil && len(left) != 0 {
err = unstable.NewParserError(left, "extra characters") err = unstable.NewParserError(left, len(data)-len(left), "extra characters")
} }
if err != nil { if err != nil {
return err return err
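The UnmarshalText methods pass base=0 and locate the leftover bytes with len(b)-len(left). That works because the remainder returned by the parser is a suffix of the input, so its start offset equals the number of bytes consumed. A small sketch of the arithmetic, using hypothetical helpers (takeTime and suffixOffset are not go-toml functions):

```go
package main

import "fmt"

// takeTime is a hypothetical parser stub: it consumes a fixed-width
// "HH:MM:SS" prefix without validating it and returns the remaining suffix.
func takeTime(b []byte) (rest []byte, ok bool) {
	const width = 8
	if len(b) < width {
		return nil, false
	}
	return b[width:], true
}

// suffixOffset returns the offset within full at which rest begins,
// assuming rest is a suffix of full.
func suffixOffset(full, rest []byte) int {
	return len(full) - len(rest)
}

func main() {
	b := []byte("12:34:56Z")
	rest, _ := takeTime(b)
	fmt.Println(suffixOffset(b, rest)) // prints "8"
}
```

When the input is itself a slice of a larger document, adding the input's own base to this value gives the absolute position.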
+2 -2
@@ -4,8 +4,8 @@ import (
"testing" "testing"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestLocalDate_AsTime(t *testing.T) { func TestLocalDate_AsTime(t *testing.T) {
+3 -3
@@ -15,7 +15,7 @@ import (
"time" "time"
"unicode" "unicode"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/characters" "github.com/pelletier/go-toml/v2/internal/characters"
) )
// Marshal serializes a Go value as a TOML document. // Marshal serializes a Go value as a TOML document.
@@ -1110,7 +1110,7 @@ func (enc *Encoder) encodeSliceAsArrayTable(b []byte, ctx encoderCtx, v reflect.
scratch = enc.indent(ctx.indent, scratch) scratch = enc.indent(ctx.indent, scratch)
} }
scratch = append(scratch, "["...) scratch = append(scratch, "[["...)
for i, k := range ctx.parentKey { for i, k := range ctx.parentKey {
if i > 0 { if i > 0 {
@@ -1120,7 +1120,7 @@ func (enc *Encoder) encodeSliceAsArrayTable(b []byte, ctx encoderCtx, v reflect.
scratch = enc.encodeKey(scratch, k) scratch = enc.encodeKey(scratch, k)
} }
scratch = append(scratch, "]\n"...) scratch = append(scratch, "]]\n"...)
ctx.skipTableHeader = true ctx.skipTableHeader = true
b = enc.encodeComment(ctx.indent, ctx.options.comment, b) b = enc.encodeComment(ctx.indent, ctx.options.comment, b)
+3 -3
@@ -13,8 +13,8 @@ import (
"testing" "testing"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
type marshalTextKey struct { type marshalTextKey struct {
@@ -2220,7 +2220,7 @@ port = 4242
// TestMarshalIssue975 tests that nil pointer values in maps are marshaled as // TestMarshalIssue975 tests that nil pointer values in maps are marshaled as
// empty tables, allowing round-trip marshaling to work correctly. // empty tables, allowing round-trip marshaling to work correctly.
// See https://git.ostiwe.com/ostiwe/go-toml/issues/975 // See https://github.com/pelletier/go-toml/issues/975
func TestMarshalIssue975(t *testing.T) { func TestMarshalIssue975(t *testing.T) {
// Test case from the issue: map[string]*struct{} // Test case from the issue: map[string]*struct{}
oldMap := map[string]*struct{}{ oldMap := map[string]*struct{}{
+1 -1
@@ -6,7 +6,7 @@ import (
"reflect" "reflect"
"strings" "strings"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
) )
// FuzzToml is the fuzzing target. // FuzzToml is the fuzzing target.
+10 -8
@@ -1,8 +1,8 @@
package toml package toml
import ( import (
"git.ostiwe.com/ostiwe/go-toml/v2/internal/tracker" "github.com/pelletier/go-toml/v2/internal/tracker"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
type strict struct { type strict struct {
@@ -54,10 +54,12 @@ func (s *strict) MissingTable(node *unstable.Node) {
return return
} }
loc, offset := s.keyLocation(node)
s.missing = append(s.missing, unstable.ParserError{ s.missing = append(s.missing, unstable.ParserError{
Highlight: s.keyLocation(node), Highlight: loc,
Message: "missing table", Message: "missing table",
Key: s.key.Key(), Key: s.key.Key(),
Offset: offset,
}) })
} }
@@ -66,10 +68,12 @@ func (s *strict) MissingField(node *unstable.Node) {
return return
} }
loc, offset := s.keyLocation(node)
s.missing = append(s.missing, unstable.ParserError{ s.missing = append(s.missing, unstable.ParserError{
Highlight: s.keyLocation(node), Highlight: loc,
Message: "missing field", Message: "missing field",
Key: s.key.Key(), Key: s.key.Key(),
Offset: offset,
}) })
} }
@@ -90,7 +94,7 @@ func (s *strict) Error(doc []byte) error {
return err return err
} }
func (s *strict) keyLocation(node *unstable.Node) []byte { func (s *strict) keyLocation(node *unstable.Node) ([]byte, int) {
k := node.Key() k := node.Key()
hasOne := k.Next() hasOne := k.Next()
@@ -98,7 +102,6 @@ func (s *strict) keyLocation(node *unstable.Node) []byte {
panic("should not be called with empty key") panic("should not be called with empty key")
} }
// Get the range from the first key to the last key.
firstRaw := k.Node().Raw firstRaw := k.Node().Raw
lastRaw := firstRaw lastRaw := firstRaw
@@ -106,9 +109,8 @@ func (s *strict) keyLocation(node *unstable.Node) []byte {
lastRaw = k.Node().Raw lastRaw = k.Node().Raw
} }
// Compute the slice from the document using the ranges.
start := firstRaw.Offset start := firstRaw.Offset
end := lastRaw.Offset + lastRaw.Length end := lastRaw.Offset + lastRaw.Length
return s.doc[start:end] return s.doc[start:end], int(start)
} }
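keyLocation now returns both the highlighted bytes and the offset where they start, so the strict-mode errors can fill Offset without any pointer arithmetic. The idea can be sketched in isolation as below. The span type and keySpan function are hypothetical, modeled on the Offset and Length fields used in the diff rather than copied from the library.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a byte range inside a document.
type span struct {
	Offset uint32
	Length uint32
}

// keySpan returns the bytes covering a dotted key, from the start of its
// first part to the end of its last part, together with the start offset.
func keySpan(doc []byte, first, last span) ([]byte, int) {
	start := first.Offset
	end := last.Offset + last.Length
	return doc[start:end], int(start)
}

func main() {
	doc := []byte("x = 1\na.b = 2")
	// "a" sits at byte 6 and "b" at byte 8.
	hl, off := keySpan(doc, span{Offset: 6, Length: 1}, span{Offset: 8, Length: 1})
	fmt.Printf("%s %d\n", hl, off) // prints "a.b 6"
}
```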
+3 -3
@@ -8,9 +8,9 @@ import (
"errors" "errors"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/testsuite" "github.com/pelletier/go-toml/v2/internal/testsuite"
) )
func testgenInvalid(t *testing.T, input string) { func testgenInvalid(t *testing.T, input string) {
+22 -22
@@ -12,8 +12,8 @@ import (
"sync/atomic" "sync/atomic"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/tracker" "github.com/pelletier/go-toml/v2/internal/tracker"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
// Unmarshal deserializes a TOML document into a Go value. // Unmarshal deserializes a TOML document into a Go value.
@@ -625,7 +625,7 @@ func (d *decoder) handleTable(key unstable.Iterator, v reflect.Value) (reflect.V
} }
} }
} }
return reflect.Value{}, unstable.NewParserError(key.Node().Data, "cannot store a table in a slice") return reflect.Value{}, unstable.NewParserError(key.Node().Data, int(key.Node().Raw.Offset), "cannot store a table in a slice")
} }
if key.Next() { if key.Next() {
// Still scoping the key // Still scoping the key
@@ -748,7 +748,7 @@ func (d *decoder) tryTextUnmarshaler(node *unstable.Node, v reflect.Value) (bool
if v.CanAddr() && v.Addr().Type().Implements(textUnmarshalerType) { if v.CanAddr() && v.Addr().Type().Implements(textUnmarshalerType) {
err := v.Addr().Interface().(encoding.TextUnmarshaler).UnmarshalText(node.Data) err := v.Addr().Interface().(encoding.TextUnmarshaler).UnmarshalText(node.Data)
if err != nil { if err != nil {
return false, unstable.NewParserError(d.p.Raw(node.Raw), "%w", err) return false, unstable.NewParserError(d.p.Raw(node.Raw), int(node.Raw.Offset), "%w", err)
} }
return true, nil return true, nil
@@ -896,7 +896,7 @@ func (d *decoder) unmarshalInlineTable(itable *unstable.Node, v reflect.Value) e
} }
return d.unmarshalInlineTable(itable, elem) return d.unmarshalInlineTable(itable, elem)
default: default:
return unstable.NewParserError(d.p.Raw(itable.Raw), "cannot store inline table in Go type %s", v.Kind()) return unstable.NewParserError(d.p.Raw(itable.Raw), int(itable.Raw.Offset), "cannot store inline table in Go type %s", v.Kind())
} }
it := itable.Children() it := itable.Children()
@@ -916,26 +916,26 @@ func (d *decoder) unmarshalInlineTable(itable *unstable.Node, v reflect.Value) e
} }
func (d *decoder) unmarshalDateTime(value *unstable.Node, v reflect.Value) error { func (d *decoder) unmarshalDateTime(value *unstable.Node, v reflect.Value) error {
dt, err := parseDateTime(value.Data) dt, err := parseDateTime(value.Data, int(value.Raw.Offset))
if err != nil { if err != nil {
return err return err
} }
if v.Kind() != reflect.Interface && v.Type() != timeType { if v.Kind() != reflect.Interface && v.Type() != timeType {
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("datetime", v.Type())) return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("datetime", v.Type()))
} }
v.Set(reflect.ValueOf(dt)) v.Set(reflect.ValueOf(dt))
return nil return nil
} }
func (d *decoder) unmarshalLocalDate(value *unstable.Node, v reflect.Value) error { func (d *decoder) unmarshalLocalDate(value *unstable.Node, v reflect.Value) error {
ld, err := parseLocalDate(value.Data) ld, err := parseLocalDate(value.Data, int(value.Raw.Offset))
if err != nil { if err != nil {
return err return err
} }
if v.Kind() != reflect.Interface && v.Type() != timeType { if v.Kind() != reflect.Interface && v.Type() != timeType {
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local date", v.Type())) return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local date", v.Type()))
} }
if v.Type() == timeType { if v.Type() == timeType {
v.Set(reflect.ValueOf(ld.AsTime(time.Local))) v.Set(reflect.ValueOf(ld.AsTime(time.Local)))
@@ -946,34 +946,34 @@ func (d *decoder) unmarshalLocalDate(value *unstable.Node, v reflect.Value) erro
} }
func (d *decoder) unmarshalLocalTime(value *unstable.Node, v reflect.Value) error { func (d *decoder) unmarshalLocalTime(value *unstable.Node, v reflect.Value) error {
lt, rest, err := parseLocalTime(value.Data) lt, rest, err := parseLocalTime(value.Data, int(value.Raw.Offset))
if err != nil { if err != nil {
return err return err
} }
if len(rest) > 0 { if len(rest) > 0 {
return unstable.NewParserError(rest, "extra characters at the end of a local time") return unstable.NewParserError(rest, int(value.Raw.Offset)+len(value.Data)-len(rest), "extra characters at the end of a local time")
} }
if v.Kind() != reflect.Interface { if v.Kind() != reflect.Interface {
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local time", v.Type())) return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local time", v.Type()))
} }
v.Set(reflect.ValueOf(lt)) v.Set(reflect.ValueOf(lt))
return nil return nil
} }
func (d *decoder) unmarshalLocalDateTime(value *unstable.Node, v reflect.Value) error { func (d *decoder) unmarshalLocalDateTime(value *unstable.Node, v reflect.Value) error {
ldt, rest, err := parseLocalDateTime(value.Data) ldt, rest, err := parseLocalDateTime(value.Data, int(value.Raw.Offset))
if err != nil { if err != nil {
return err return err
} }
if len(rest) > 0 { if len(rest) > 0 {
return unstable.NewParserError(rest, "extra characters at the end of a local date time") return unstable.NewParserError(rest, int(value.Raw.Offset)+len(value.Data)-len(rest), "extra characters at the end of a local date time")
} }
if v.Kind() != reflect.Interface && v.Type() != timeType { if v.Kind() != reflect.Interface && v.Type() != timeType {
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("local datetime", v.Type())) return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("local datetime", v.Type()))
} }
if v.Type() == timeType { if v.Type() == timeType {
v.Set(reflect.ValueOf(ldt.AsTime(time.Local))) v.Set(reflect.ValueOf(ldt.AsTime(time.Local)))
@@ -992,14 +992,14 @@ func (d *decoder) unmarshalBool(value *unstable.Node, v reflect.Value) error {
case reflect.Interface: case reflect.Interface:
v.Set(reflect.ValueOf(b)) v.Set(reflect.ValueOf(b))
default: default:
return unstable.NewParserError(value.Data, "cannot assign boolean to a %t", b) return unstable.NewParserError(value.Data, int(value.Raw.Offset), "cannot assign boolean to a %t", b)
} }
return nil return nil
} }
func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error { func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error {
f, err := parseFloat(value.Data) f, err := parseFloat(value.Data, int(value.Raw.Offset))
if err != nil { if err != nil {
return err return err
} }
@@ -1009,13 +1009,13 @@ func (d *decoder) unmarshalFloat(value *unstable.Node, v reflect.Value) error {
v.SetFloat(f) v.SetFloat(f)
case reflect.Float32: case reflect.Float32:
if f > math.MaxFloat32 { if f > math.MaxFloat32 {
return unstable.NewParserError(value.Data, "number %f does not fit in a float32", f) return unstable.NewParserError(value.Data, int(value.Raw.Offset), "number %f does not fit in a float32", f)
} }
v.SetFloat(f) v.SetFloat(f)
case reflect.Interface: case reflect.Interface:
v.Set(reflect.ValueOf(f)) v.Set(reflect.ValueOf(f))
default: default:
return unstable.NewParserError(value.Data, "float cannot be assigned to %s", v.Kind()) return unstable.NewParserError(value.Data, int(value.Raw.Offset), "float cannot be assigned to %s", v.Kind())
} }
return nil return nil
@@ -1048,7 +1048,7 @@ func (d *decoder) unmarshalInteger(value *unstable.Node, v reflect.Value) error
return d.unmarshalFloat(value, v) return d.unmarshalFloat(value, v)
} }
i, err := parseInteger(value.Data) i, err := parseInteger(value.Data, int(value.Raw.Offset))
if err != nil { if err != nil {
return err return err
} }
@@ -1116,7 +1116,7 @@ func (d *decoder) unmarshalInteger(value *unstable.Node, v reflect.Value) error
case reflect.Interface: case reflect.Interface:
r = reflect.ValueOf(i) r = reflect.ValueOf(i)
default: default:
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("integer", v.Type())) return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("integer", v.Type()))
} }
if !r.Type().AssignableTo(v.Type()) { if !r.Type().AssignableTo(v.Type()) {
@@ -1135,7 +1135,7 @@ func (d *decoder) unmarshalString(value *unstable.Node, v reflect.Value) error {
case reflect.Interface: case reflect.Interface:
v.Set(reflect.ValueOf(string(value.Data))) v.Set(reflect.ValueOf(string(value.Data)))
default: default:
return unstable.NewParserError(d.p.Raw(value.Raw), "%s", d.typeMismatchString("string", v.Type())) return unstable.NewParserError(d.p.Raw(value.Raw), int(value.Raw.Offset), "%s", d.typeMismatchString("string", v.Type()))
} }
return nil return nil
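The local-time and local-datetime paths above report trailing bytes at `int(value.Raw.Offset)+len(value.Data)-len(rest)`. The following is a minimal sketch of that arithmetic, assuming `rest` is always a tail of the value's data. `restOffset` is a hypothetical helper name used only for illustration; it is not part of go-toml.

```go
package main

import "fmt"

// restOffset returns the document offset of rest, given that data starts at
// base in the document and rest is a suffix (tail) of data.
// Hypothetical helper for illustration; not part of the go-toml API.
func restOffset(base int, data, rest []byte) int {
	return base + len(data) - len(rest)
}

func main() {
	doc := []byte("t = 07:32:00x")
	// The value starts at byte 4 of doc.
	base := 4
	data := doc[base:]
	// The last byte stands in for the unparsed tail of the value.
	rest := data[len(data)-1:]
	off := restOffset(base, data, rest)
	fmt.Printf("offset %d points at %q\n", off, doc[off])
}
```

The result is only meaningful when `rest` really is a tail of `data`; a slice taken from elsewhere would silently produce a wrong offset.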
+3 -3
@@ -11,9 +11,9 @@ import (
"testing" "testing"
"time" "time"
"git.ostiwe.com/ostiwe/go-toml/v2" "github.com/pelletier/go-toml/v2"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
"git.ostiwe.com/ostiwe/go-toml/v2/unstable" "github.com/pelletier/go-toml/v2/unstable"
) )
type unmarshalTextKey struct { type unmarshalTextKey struct {
+1 -1
@@ -35,7 +35,7 @@ func BenchmarkScanComments(b *testing.B) {
b.ResetTimer() b.ResetTimer()
for i := 0; i < b.N; i++ { for i := 0; i < b.N; i++ {
_, _, _ = scanComment(input) _, _, _ = scanComment(input, 0)
} }
}) })
} }
+59 -70
@@ -5,7 +5,7 @@ import (
"fmt" "fmt"
"unicode" "unicode"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/characters" "github.com/pelletier/go-toml/v2/internal/characters"
) )
// ParserError describes an error relative to the content of the document. // ParserError describes an error relative to the content of the document.
@@ -16,6 +16,7 @@ type ParserError struct {
Highlight []byte Highlight []byte
Message string Message string
Key []string // optional Key []string // optional
Offset int
} }
// Error is the implementation of the error interface. // Error is the implementation of the error interface.
@@ -27,9 +28,10 @@ func (e *ParserError) Error() string {
// //
// Warning: Highlight needs to be a subslice of Parser.data, so only slices // Warning: Highlight needs to be a subslice of Parser.data, so only slices
// returned by Parser.Raw are valid candidates. // returned by Parser.Raw are valid candidates.
func NewParserError(highlight []byte, format string, args ...interface{}) error { func NewParserError(highlight []byte, offset int, format string, args ...interface{}) error {
return &ParserError{ return &ParserError{
Highlight: highlight, Highlight: highlight,
Offset: offset,
Message: fmt.Errorf(format, args...).Error(), Message: fmt.Errorf(format, args...).Error(),
} }
} }
@@ -64,14 +66,8 @@ func (p *Parser) Data() []byte {
return p.data return p.data
} }
// Range returns a range description that corresponds to a given slice of the func (p *Parser) offsetOf(b []byte) int {
// input. If the argument is not a subslice of the parser input, this function return len(p.data) - len(b)
// panics.
func (p *Parser) Range(b []byte) Range {
return Range{
Offset: uint32(p.subsliceOffset(b)), //nolint:gosec // TOML documents are small
Length: uint32(len(b)), //nolint:gosec // TOML documents are small
}
} }
// rangeOfToken computes the Range of a token given the remaining bytes after the token. // rangeOfToken computes the Range of a token given the remaining bytes after the token.
@@ -82,13 +78,6 @@ func (p *Parser) rangeOfToken(token, rest []byte) Range {
return Range{Offset: uint32(offset), Length: uint32(len(token))} //nolint:gosec // TOML documents are small return Range{Offset: uint32(offset), Length: uint32(len(token))} //nolint:gosec // TOML documents are small
} }
// subsliceOffset returns the byte offset of subslice b within p.data.
// b must be a suffix (tail) of p.data.
func (p *Parser) subsliceOffset(b []byte) int {
// b is a suffix of p.data, so its offset is len(p.data) - len(b)
return len(p.data) - len(b)
}
// Raw returns the slice corresponding to the bytes in the given range. // Raw returns the slice corresponding to the bytes in the given range.
func (p *Parser) Raw(raw Range) []byte { func (p *Parser) Raw(raw Range) []byte {
return p.data[raw.Offset : raw.Offset+raw.Length] return p.data[raw.Offset : raw.Offset+raw.Length]
@@ -198,16 +187,16 @@ func (p *Parser) parseNewline(b []byte) ([]byte, error) {
} }
if b[0] == '\r' { if b[0] == '\r' {
_, rest, err := scanWindowsNewline(b) _, rest, err := scanWindowsNewline(b, p.offsetOf(b))
return rest, err return rest, err
} }
return nil, NewParserError(b[0:1], "expected newline but got %#U", b[0]) return nil, NewParserError(b[0:1], p.offsetOf(b), "expected newline but got %#U", b[0])
} }
func (p *Parser) parseComment(b []byte) (reference, []byte, error) { func (p *Parser) parseComment(b []byte) (reference, []byte, error) {
ref := invalidReference ref := invalidReference
data, rest, err := scanComment(b) data, rest, err := scanComment(b, p.offsetOf(b))
if p.KeepComments && err == nil { if p.KeepComments && err == nil {
ref = p.builder.Push(Node{ ref = p.builder.Push(Node{
Kind: Comment, Kind: Comment,
@@ -291,12 +280,12 @@ func (p *Parser) parseArrayTable(b []byte) (reference, []byte, error) {
p.builder.AttachChild(ref, k) p.builder.AttachChild(ref, k)
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
b, err = expect(']', b) b, err = expect(']', b, p.offsetOf(b))
if err != nil { if err != nil {
return ref, nil, err return ref, nil, err
} }
b, err = expect(']', b) b, err = expect(']', b, p.offsetOf(b))
return ref, b, err return ref, b, err
} }
@@ -321,7 +310,7 @@ func (p *Parser) parseStdTable(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
b, err = expect(']', b) b, err = expect(']', b, p.offsetOf(b))
return ref, b, err return ref, b, err
} }
@@ -345,10 +334,10 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
if len(b) == 0 { if len(b) == 0 {
return invalidReference, nil, NewParserError(startB[:len(startB)-len(b)], "expected = after a key, but the document ends there") return invalidReference, nil, NewParserError(startB[:len(startB)-len(b)], p.offsetOf(startB), "expected = after a key, but the document ends there")
} }
b, err = expect('=', b) b, err = expect('=', b, p.offsetOf(b))
if err != nil { if err != nil {
return invalidReference, nil, err return invalidReference, nil, err
} }
@@ -377,7 +366,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
ref := invalidReference ref := invalidReference
if len(b) == 0 { if len(b) == 0 {
return ref, nil, NewParserError(b, "expected value, not eof") return ref, nil, NewParserError(b, p.offsetOf(b), "expected value, not eof")
} }
var err error var err error
@@ -422,7 +411,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
return ref, b, err return ref, b, err
case 't': case 't':
if !scanFollowsTrue(b) { if !scanFollowsTrue(b) {
return ref, nil, NewParserError(atmost(b, 4), "expected 'true'") return ref, nil, NewParserError(atmost(b, 4), p.offsetOf(b), "expected 'true'")
} }
ref = p.builder.Push(Node{ ref = p.builder.Push(Node{
@@ -433,7 +422,7 @@ func (p *Parser) parseVal(b []byte) (reference, []byte, error) {
return ref, b[4:], nil return ref, b[4:], nil
case 'f': case 'f':
if !scanFollowsFalse(b) { if !scanFollowsFalse(b) {
return ref, nil, NewParserError(atmost(b, 5), "expected 'false'") return ref, nil, NewParserError(atmost(b, 5), p.offsetOf(b), "expected 'false'")
} }
ref = p.builder.Push(Node{ ref = p.builder.Push(Node{
@@ -460,7 +449,7 @@ func atmost(b []byte, n int) []byte {
} }
func (p *Parser) parseLiteralString(b []byte) ([]byte, []byte, []byte, error) { func (p *Parser) parseLiteralString(b []byte) ([]byte, []byte, []byte, error) {
v, rest, err := scanLiteralString(b) v, rest, err := scanLiteralString(b, p.offsetOf(b))
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -492,7 +481,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b) b = p.parseWhitespace(b)
if len(b) == 0 { if len(b) == 0 {
return parent, nil, NewParserError(previousB[:1], "inline table is incomplete") return parent, nil, NewParserError(previousB[:1], p.offsetOf(previousB), "inline table is incomplete")
} }
if b[0] == '}' { if b[0] == '}' {
@@ -500,7 +489,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
} }
if !first { if !first {
b, err = expect(',', b) b, err = expect(',', b, p.offsetOf(b))
if err != nil { if err != nil {
return parent, nil, err return parent, nil, err
} }
@@ -524,7 +513,7 @@ func (p *Parser) parseInlineTable(b []byte) (reference, []byte, error) {
first = false first = false
} }
rest, err := expect('}', b) rest, err := expect('}', b, p.offsetOf(b))
return parent, rest, err return parent, rest, err
} }
@@ -573,7 +562,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
} }
if len(b) == 0 { if len(b) == 0 {
return parent, nil, NewParserError(arrayStart[:1], "array is incomplete") return parent, nil, NewParserError(arrayStart[:1], p.offsetOf(arrayStart), "array is incomplete")
} }
if b[0] == ']' { if b[0] == ']' {
@@ -582,7 +571,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
if b[0] == ',' { if b[0] == ',' {
if first { if first {
return parent, nil, NewParserError(b[0:1], "array cannot start with comma") return parent, nil, NewParserError(b[0:1], p.offsetOf(b), "array cannot start with comma")
} }
b = b[1:] b = b[1:]
@@ -594,7 +583,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
addChild(cref) addChild(cref)
} }
} else if !first { } else if !first {
return parent, nil, NewParserError(b[0:1], "array elements must be separated by commas") return parent, nil, NewParserError(b[0:1], p.offsetOf(b), "array elements must be separated by commas")
} }
// TOML allows trailing commas in arrays. // TOML allows trailing commas in arrays.
@@ -621,7 +610,7 @@ func (p *Parser) parseValArray(b []byte) (reference, []byte, error) {
first = false first = false
} }
rest, err := expect(']', b) rest, err := expect(']', b, p.offsetOf(b))
return parent, rest, err return parent, rest, err
} }
@@ -676,7 +665,7 @@ func (p *Parser) parseOptionalWhitespaceCommentNewline(b []byte) (reference, []b
} }
func (p *Parser) parseMultilineLiteralString(b []byte) ([]byte, []byte, []byte, error) { func (p *Parser) parseMultilineLiteralString(b []byte) ([]byte, []byte, []byte, error) {
token, rest, err := scanMultilineLiteralString(b) token, rest, err := scanMultilineLiteralString(b, p.offsetOf(b))
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -705,7 +694,7 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
// mlb-quotes = 1*2quotation-mark // mlb-quotes = 1*2quotation-mark
// mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii // mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// mlb-escaped-nl = escape ws newline *( wschar / newline ) // mlb-escaped-nl = escape ws newline *( wschar / newline )
token, escaped, rest, err := scanMultilineBasicString(b) token, escaped, rest, err := scanMultilineBasicString(b, p.offsetOf(b))
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -722,14 +711,15 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
// fast path // fast path
startIdx := i startIdx := i
endIdx := len(token) - len(`"""`) endIdx := len(token) - len(`"""`)
tokenBase := p.offsetOf(b) // token is a prefix of b; only b is a tail of p.data
if !escaped { if !escaped {
str := token[startIdx:endIdx] str := token[startIdx:endIdx]
highlight := characters.Utf8TomlValidAlreadyEscaped(str) invalidIdx := characters.Utf8TomlValidAlreadyEscaped(str)
if len(highlight) == 0 { if invalidIdx < 0 {
return token, str, rest, nil return token, str, rest, nil
} }
return nil, nil, nil, NewParserError(highlight, "invalid UTF-8") return nil, nil, nil, NewParserError(str[invalidIdx:invalidIdx+1], tokenBase+startIdx+invalidIdx, "invalid UTF-8")
} }
var builder bytes.Buffer var builder bytes.Buffer
@@ -794,14 +784,14 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
case 'e': case 'e':
builder.WriteByte(0x1B) builder.WriteByte(0x1B)
case 'u': case 'u':
x, err := hexToRune(atmost(token[i+1:], 4), 4) x, err := hexToRune(atmost(token[i+1:], 4), tokenBase+i+1, 4)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
builder.WriteRune(x) builder.WriteRune(x)
i += 4 i += 4
case 'U': case 'U':
x, err := hexToRune(atmost(token[i+1:], 8), 8) x, err := hexToRune(atmost(token[i+1:], 8), tokenBase+i+1, 8)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -809,13 +799,13 @@ func (p *Parser) parseMultilineBasicString(b []byte) ([]byte, []byte, []byte, er
builder.WriteRune(x) builder.WriteRune(x)
i += 8 i += 8
default: default:
return nil, nil, nil, NewParserError(token[i:i+1], "invalid escaped character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid escaped character %#U", c)
} }
i++ i++
} else { } else {
size := characters.Utf8ValidNext(token[i:]) size := characters.Utf8ValidNext(token[i:])
if size == 0 { if size == 0 {
return nil, nil, nil, NewParserError(token[i:i+1], "invalid character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid character %#U", c)
} }
builder.Write(token[i : i+size]) builder.Write(token[i : i+size])
i += size i += size
@@ -870,12 +860,9 @@ func (p *Parser) parseKey(b []byte) (reference, []byte, error) {
func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) { func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) {
if len(b) == 0 { if len(b) == 0 {
return nil, nil, nil, NewParserError(b, "expected key but found none") return nil, nil, nil, NewParserError(b, p.offsetOf(b), "expected key but found none")
} }
// simple-key = quoted-key / unquoted-key
// unquoted-key = 1*( ALPHA / DIGIT / %x2D / %x5F ) ; A-Z / a-z / 0-9 / - / _
// quoted-key = basic-string / literal-string
switch { switch {
case b[0] == '\'': case b[0] == '\'':
return p.parseLiteralString(b) return p.parseLiteralString(b)
@@ -885,7 +872,7 @@ func (p *Parser) parseSimpleKey(b []byte) (raw, key, rest []byte, err error) {
key, rest = scanUnquotedKey(b) key, rest = scanUnquotedKey(b)
return key, key, rest, nil return key, key, rest, nil
default: default:
return nil, nil, nil, NewParserError(b[0:1], "invalid character at start of key: %c", b[0]) return nil, nil, nil, NewParserError(b[0:1], p.offsetOf(b), "invalid character at start of key: %c", b[0])
} }
} }
@@ -905,7 +892,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
// escape-seq-char =/ %x74 ; t tab U+0009 // escape-seq-char =/ %x74 ; t tab U+0009
// escape-seq-char =/ %x75 4HEXDIG ; uXXXX U+XXXX // escape-seq-char =/ %x75 4HEXDIG ; uXXXX U+XXXX
// escape-seq-char =/ %x55 8HEXDIG ; UXXXXXXXX U+XXXXXXXX // escape-seq-char =/ %x55 8HEXDIG ; UXXXXXXXX U+XXXXXXXX
token, escaped, rest, err := scanBasicString(b) token, escaped, rest, err := scanBasicString(b, p.offsetOf(b))
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -916,13 +903,15 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
// Fast path. If there is no escape sequence, the string should just be // Fast path. If there is no escape sequence, the string should just be
// an UTF-8 encoded string, which is the same as Go. In that case, // an UTF-8 encoded string, which is the same as Go. In that case,
// validate the string and return a direct reference to the buffer. // validate the string and return a direct reference to the buffer.
tokenBase := p.offsetOf(b) // token is a prefix of b; only b is a tail of p.data
if !escaped { if !escaped {
str := token[startIdx:endIdx] str := token[startIdx:endIdx]
highlight := characters.Utf8TomlValidAlreadyEscaped(str) invalidIdx := characters.Utf8TomlValidAlreadyEscaped(str)
if len(highlight) == 0 { if invalidIdx < 0 {
return token, str, rest, nil return token, str, rest, nil
} }
return nil, nil, nil, NewParserError(highlight, "invalid UTF-8") return nil, nil, nil, NewParserError(str[invalidIdx:invalidIdx+1], tokenBase+startIdx+invalidIdx, "invalid UTF-8")
} }
i := startIdx i := startIdx
@@ -953,7 +942,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
case 'e': case 'e':
builder.WriteByte(0x1B) builder.WriteByte(0x1B)
case 'u': case 'u':
x, err := hexToRune(token[i+1:len(token)-1], 4) x, err := hexToRune(token[i+1:len(token)-1], tokenBase+i+1, 4)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -961,7 +950,7 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
builder.WriteRune(x) builder.WriteRune(x)
i += 4 i += 4
case 'U': case 'U':
x, err := hexToRune(token[i+1:len(token)-1], 8) x, err := hexToRune(token[i+1:len(token)-1], tokenBase+i+1, 8)
if err != nil { if err != nil {
return nil, nil, nil, err return nil, nil, nil, err
} }
@@ -969,13 +958,13 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
builder.WriteRune(x) builder.WriteRune(x)
i += 8 i += 8
default: default:
return nil, nil, nil, NewParserError(token[i:i+1], "invalid escaped character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid escaped character %#U", c)
} }
i++ i++
} else { } else {
size := characters.Utf8ValidNext(token[i:]) size := characters.Utf8ValidNext(token[i:])
if size == 0 { if size == 0 {
return nil, nil, nil, NewParserError(token[i:i+1], "invalid character %#U", c) return nil, nil, nil, NewParserError(token[i:i+1], tokenBase+i, "invalid character %#U", c)
} }
builder.Write(token[i : i+size]) builder.Write(token[i : i+size])
i += size i += size
@@ -985,9 +974,9 @@ func (p *Parser) parseBasicString(b []byte) ([]byte, []byte, []byte, error) {
return token, builder.Bytes(), rest, nil return token, builder.Bytes(), rest, nil
} }
func hexToRune(b []byte, length int) (rune, error) { func hexToRune(b []byte, base int, length int) (rune, error) {
if len(b) < length { if len(b) < length {
return -1, NewParserError(b, "unicode point needs %d character, not %d", length, len(b)) return -1, NewParserError(b, base, "unicode point needs %d character, not %d", length, len(b))
} }
b = b[:length] b = b[:length]
@@ -1002,13 +991,13 @@ func hexToRune(b []byte, length int) (rune, error) {
case 'A' <= c && c <= 'F': case 'A' <= c && c <= 'F':
d = uint32(c - 'A' + 10) d = uint32(c - 'A' + 10)
default: default:
return -1, NewParserError(b[i:i+1], "non-hex character") return -1, NewParserError(b[i:i+1], base+i, "non-hex character")
} }
r = r*16 + d r = r*16 + d
} }
if r > unicode.MaxRune || 0xD800 <= r && r < 0xE000 { if r > unicode.MaxRune || 0xD800 <= r && r < 0xE000 {
return -1, NewParserError(b, "escape sequence is invalid Unicode code point") return -1, NewParserError(b, base, "escape sequence is invalid Unicode code point")
} }
return rune(r), nil return rune(r), nil
@@ -1028,7 +1017,7 @@ func (p *Parser) parseIntOrFloatOrDateTime(b []byte) (reference, []byte, error)
switch b[0] { switch b[0] {
case 'i': case 'i':
if !scanFollowsInf(b) { if !scanFollowsInf(b) {
return invalidReference, nil, NewParserError(atmost(b, 3), "expected 'inf'") return invalidReference, nil, NewParserError(atmost(b, 3), p.offsetOf(b), "expected 'inf'")
} }
return p.builder.Push(Node{ return p.builder.Push(Node{
@@ -1038,7 +1027,7 @@ func (p *Parser) parseIntOrFloatOrDateTime(b []byte) (reference, []byte, error)
}), b[3:], nil }), b[3:], nil
case 'n': case 'n':
if !scanFollowsNan(b) { if !scanFollowsNan(b) {
return invalidReference, nil, NewParserError(atmost(b, 3), "expected 'nan'") return invalidReference, nil, NewParserError(atmost(b, 3), p.offsetOf(b), "expected 'nan'")
} }
return p.builder.Push(Node{ return p.builder.Push(Node{
@@ -1197,7 +1186,7 @@ func (p *Parser) scanIntOrFloat(b []byte) (reference, []byte, error) {
}), b[i+3:], nil }), b[i+3:], nil
} }
return invalidReference, nil, NewParserError(b[i:i+1], "unexpected character 'i' while scanning for a number") return invalidReference, nil, NewParserError(b[i:i+1], p.offsetOf(b)+i, "unexpected character 'i' while scanning for a number")
} }
if c == 'n' { if c == 'n' {
@@ -1209,14 +1198,14 @@ func (p *Parser) scanIntOrFloat(b []byte) (reference, []byte, error) {
}), b[i+3:], nil }), b[i+3:], nil
} }
return invalidReference, nil, NewParserError(b[i:i+1], "unexpected character 'n' while scanning for a number") return invalidReference, nil, NewParserError(b[i:i+1], p.offsetOf(b)+i, "unexpected character 'n' while scanning for a number")
} }
break break
} }
if i == 0 { if i == 0 {
return invalidReference, b, NewParserError(b, "incomplete number") return invalidReference, b, NewParserError(b, p.offsetOf(b), "incomplete number")
} }
kind := Integer kind := Integer
@@ -1253,13 +1242,13 @@ func isValidBinaryRune(r byte) bool {
return r == '0' || r == '1' || r == '_' return r == '0' || r == '1' || r == '_'
} }
func expect(x byte, b []byte) ([]byte, error) { func expect(x byte, b []byte, base int) ([]byte, error) {
if len(b) == 0 { if len(b) == 0 {
return nil, NewParserError(b, "expected character %c but the document ended here", x) return nil, NewParserError(b, base, "expected character %c but the document ended here", x)
} }
if b[0] != x { if b[0] != x {
return nil, NewParserError(b[0:1], "expected character %c", x) return nil, NewParserError(b[0:1], base, "expected character %c", x)
} }
return b[1:], nil return b[1:], nil
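The `offsetOf` helper in this file relies on suffix-length arithmetic (`len(p.data) - len(b)`), which holds only when `b` is a tail of the parser input. The sketch below illustrates that precondition with a free function of the same shape; the two-argument `offsetOf` here is an assumption made for the example and is not the library's method.

```go
package main

import "fmt"

// offsetOf mirrors the suffix-length arithmetic from the diff as a free
// function. It is correct only when b is a tail of data.
func offsetOf(data, b []byte) int {
	return len(data) - len(b)
}

func main() {
	data := []byte("a = 1\n= 2")
	// rest is a tail of data, starting at the second line.
	rest := data[6:]
	// token is a prefix of rest, so it is not a tail of data.
	token := rest[:1]

	// For a tail, the arithmetic yields the real starting offset.
	fmt.Println(offsetOf(data, rest))
	// For a non-tail slice, the result is not the slice's real offset.
	fmt.Println(offsetOf(data, token))
}
```

This is why scanner-level functions such as `expect` and `hexToRune` take an explicit base: they receive slices that may not extend to the end of the document.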
+92 -1
@@ -1,12 +1,13 @@
package unstable package unstable
import ( import (
"errors"
"fmt" "fmt"
"strconv" "strconv"
"strings" "strings"
"testing" "testing"
"git.ostiwe.com/ostiwe/go-toml/v2/internal/assert" "github.com/pelletier/go-toml/v2/internal/assert"
) )
func TestParser_AST_Numbers(t *testing.T) { func TestParser_AST_Numbers(t *testing.T) {
@@ -673,6 +674,96 @@ key3 = "value3"
assert.Equal(t, []string{"key1", "key2", "key3"}, keys) assert.Equal(t, []string{"key1", "key2", "key3"}, keys)
} }
func TestErrorOffsetAfterComment(t *testing.T) {
input := []byte("# comment\n= \"value\"")
p := Parser{}
p.Reset(input)
for p.NextExpression() {
}
err := p.Error()
if err == nil {
t.Fatal("expected an error")
}
var perr *ParserError
if !errors.As(err, &perr) {
t.Fatalf("expected ParserError, got %T", err)
}
if perr.Offset != 10 {
t.Errorf("offset: got %d, want 10", perr.Offset)
}
shape := p.Shape(Range{Offset: uint32(perr.Offset), Length: uint32(len(perr.Highlight))})
if shape.Start.Line != 2 || shape.Start.Column != 1 {
t.Errorf("position: got %d:%d, want 2:1", shape.Start.Line, shape.Start.Column)
}
}
func TestErrorHighlightPositions(t *testing.T) {
examples := []struct {
desc string
input string
wantLine int
wantColumn int
}{
{
desc: "invalid key start after comment",
input: "# comment\n= \"value\"",
wantLine: 2,
wantColumn: 1,
},
{
desc: "invalid key start on first line",
input: "= \"value\"",
wantLine: 1,
wantColumn: 1,
},
{
desc: "invalid key after multiple comments",
input: "# comment 1\n# comment 2\n= \"value\"",
wantLine: 3,
wantColumn: 1,
},
{
desc: "invalid key after valid key-value",
input: "a = 1\n= \"value\"",
wantLine: 2,
wantColumn: 1,
},
{
desc: "invalid key after whitespace on line",
input: "a = 1\n = \"value\"",
wantLine: 2,
wantColumn: 3,
},
}
for _, e := range examples {
t.Run(e.desc, func(t *testing.T) {
p := Parser{}
p.Reset([]byte(e.input))
for p.NextExpression() {
}
err := p.Error()
if err == nil {
t.Fatal("expected an error")
}
var perr *ParserError
if !errors.As(err, &perr) {
t.Fatalf("expected ParserError, got %T", err)
}
shape := p.Shape(Range{Offset: uint32(perr.Offset), Length: uint32(len(perr.Highlight))})
if shape.Start.Line != e.wantLine {
t.Errorf("line: got %d, want %d", shape.Start.Line, e.wantLine)
}
if shape.Start.Column != e.wantColumn {
t.Errorf("column: got %d, want %d", shape.Start.Column, e.wantColumn)
}
})
}
}
func ExampleParser() { func ExampleParser() {
doc := ` doc := `
hello = "world" hello = "world"
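The tests above turn `perr.Offset` into a line and column through `p.Shape`. As a rough illustration of what such a conversion involves, here is a self-contained sketch; `lineCol` is a hypothetical stand-in, counts columns in bytes, and is not the library's `Shape` implementation.

```go
package main

import "fmt"

// lineCol converts a byte offset into a 1-based line and column, counting
// columns in bytes. Illustrative stand-in; not the library's Shape method.
func lineCol(doc []byte, offset int) (line, col int) {
	line, col = 1, 1
	for i := 0; i < offset && i < len(doc); i++ {
		if doc[i] == '\n' {
			line++
			col = 1
		} else {
			col++
		}
	}
	return line, col
}

func main() {
	doc := []byte("# comment\n= \"value\"")
	line, col := lineCol(doc, 10)
	fmt.Printf("%d:%d\n", line, col) // prints 2:1
}
```

With an explicit offset on the error, this conversion needs only the document bytes and an integer, not a subslice that aliases the parser's buffer.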
+28 -73
@@ -1,6 +1,6 @@
package unstable package unstable
import "git.ostiwe.com/ostiwe/go-toml/v2/internal/characters" import "github.com/pelletier/go-toml/v2/internal/characters"
func scanFollows(b []byte, pattern string) bool { func scanFollows(b []byte, pattern string) bool {
n := len(pattern) n := len(pattern)
@@ -47,48 +47,31 @@ func isUnquotedKeyChar(r byte) bool {
return (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' || r == '_' return (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-' || r == '_'
} }
func scanLiteralString(b []byte) ([]byte, []byte, error) { func scanLiteralString(b []byte, base int) ([]byte, []byte, error) {
// literal-string = apostrophe *literal-char apostrophe
// apostrophe = %x27 ; ' apostrophe
// literal-char = %x09 / %x20-26 / %x28-7E / non-ascii
for i := 1; i < len(b); { for i := 1; i < len(b); {
switch b[i] { switch b[i] {
case '\'': case '\'':
return b[:i+1], b[i+1:], nil return b[:i+1], b[i+1:], nil
case '\n', '\r': case '\n', '\r':
return nil, nil, NewParserError(b[i:i+1], "literal strings cannot have new lines") return nil, nil, NewParserError(b[i:i+1], base+i, "literal strings cannot have new lines")
} }
size := characters.Utf8ValidNext(b[i:]) size := characters.Utf8ValidNext(b[i:])
if size == 0 { if size == 0 {
return nil, nil, NewParserError(b[i:i+1], "invalid character") return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character")
} }
i += size i += size
} }
return nil, nil, NewParserError(b[len(b):], "unterminated literal string") return nil, nil, NewParserError(b[len(b):], base+len(b), "unterminated literal string")
} }
func scanMultilineLiteralString(b []byte) ([]byte, []byte, error) { func scanMultilineLiteralString(b []byte, base int) ([]byte, []byte, error) {
// ml-literal-string = ml-literal-string-delim [ newline ] ml-literal-body
// ml-literal-string-delim
// ml-literal-string-delim = 3apostrophe
// ml-literal-body = *mll-content *( mll-quotes 1*mll-content ) [ mll-quotes ]
//
// mll-content = mll-char / newline
// mll-char = %x09 / %x20-26 / %x28-7E / non-ascii
// mll-quotes = 1*2apostrophe
for i := 3; i < len(b); { for i := 3; i < len(b); {
switch b[i] { switch b[i] {
case '\'': case '\'':
if scanFollowsMultilineLiteralStringDelimiter(b[i:]) { if scanFollowsMultilineLiteralStringDelimiter(b[i:]) {
i += 3 i += 3
// At that point we found 3 apostrophe, and i is the
// index of the byte after the third one. The scanner
// needs to be eager, because there can be an extra 2
// apostrophe that can be accepted at the end of the
// string.
if i >= len(b) || b[i] != '\'' { if i >= len(b) || b[i] != '\'' {
return b[:i], b[i:], nil return b[:i], b[i:], nil
} }
@@ -100,39 +83,39 @@ func scanMultilineLiteralString(b []byte) ([]byte, []byte, error) {
i++ i++
if i < len(b) && b[i] == '\'' { if i < len(b) && b[i] == '\'' {
return nil, nil, NewParserError(b[i-3:i+1], "''' not allowed in multiline literal string") return nil, nil, NewParserError(b[i-3:i+1], base+i-3, "''' not allowed in multiline literal string")
} }
return b[:i], b[i:], nil return b[:i], b[i:], nil
} }
case '\r': case '\r':
if len(b) < i+2 { if len(b) < i+2 {
return nil, nil, NewParserError(b[len(b):], `need a \n after \r`) return nil, nil, NewParserError(b[len(b):], base+len(b), `need a \n after \r`)
} }
if b[i+1] != '\n' { if b[i+1] != '\n' {
return nil, nil, NewParserError(b[i:i+2], `need a \n after \r`) return nil, nil, NewParserError(b[i:i+2], base+i, `need a \n after \r`)
} }
i += 2 // skip the \n i += 2
continue continue
} }
size := characters.Utf8ValidNext(b[i:]) size := characters.Utf8ValidNext(b[i:])
if size == 0 { if size == 0 {
return nil, nil, NewParserError(b[i:i+1], "invalid character") return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character")
} }
i += size i += size
} }
return nil, nil, NewParserError(b[len(b):], `multiline literal string not terminated by '''`) return nil, nil, NewParserError(b[len(b):], base+len(b), `multiline literal string not terminated by '''`)
} }
func scanWindowsNewline(b []byte) ([]byte, []byte, error) { func scanWindowsNewline(b []byte, base int) ([]byte, []byte, error) {
const lenCRLF = 2 const lenCRLF = 2
if len(b) < lenCRLF { if len(b) < lenCRLF {
return nil, nil, NewParserError(b, "windows new line expected") return nil, nil, NewParserError(b, base, "windows new line expected")
} }
if b[1] != '\n' { if b[1] != '\n' {
return nil, nil, NewParserError(b, `windows new line should be \r\n`) return nil, nil, NewParserError(b, base, `windows new line should be \r\n`)
} }
return b[:lenCRLF], b[lenCRLF:], nil return b[:lenCRLF], b[lenCRLF:], nil
@@ -151,13 +134,7 @@ func scanWhitespace(b []byte) ([]byte, []byte) {
return b, b[len(b):] return b, b[len(b):]
} }
func scanComment(b []byte) ([]byte, []byte, error) { func scanComment(b []byte, base int) ([]byte, []byte, error) {
// comment-start-symbol = %x23 ; #
// non-ascii = %x80-D7FF / %xE000-10FFFF
// non-eol = %x09 / %x20-7F / non-ascii
//
// comment = comment-start-symbol *non-eol
for i := 1; i < len(b); { for i := 1; i < len(b); {
if b[i] == '\n' { if b[i] == '\n' {
return b[:i], b[i:], nil return b[:i], b[i:], nil
@@ -166,11 +143,11 @@ func scanComment(b []byte) ([]byte, []byte, error) {
if i+1 < len(b) && b[i+1] == '\n' { if i+1 < len(b) && b[i+1] == '\n' {
return b[:i+1], b[i+1:], nil return b[:i+1], b[i+1:], nil
} }
return nil, nil, NewParserError(b[i:i+1], "invalid character in comment") return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character in comment")
} }
size := characters.Utf8ValidNext(b[i:]) size := characters.Utf8ValidNext(b[i:])
if size == 0 { if size == 0 {
return nil, nil, NewParserError(b[i:i+1], "invalid character in comment") return nil, nil, NewParserError(b[i:i+1], base+i, "invalid character in comment")
} }
i += size i += size
@@ -179,12 +156,7 @@ func scanComment(b []byte) ([]byte, []byte, error) {
return b, b[len(b):], nil return b, b[len(b):], nil
} }
func scanBasicString(b []byte) ([]byte, bool, []byte, error) { func scanBasicString(b []byte, base int) ([]byte, bool, []byte, error) {
// basic-string = quotation-mark *basic-char quotation-mark
// quotation-mark = %x22 ; "
// basic-char = basic-unescaped / escaped
// basic-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// escaped = escape escape-seq-char
escaped := false escaped := false
i := 1 i := 1
@@ -193,31 +165,20 @@ func scanBasicString(b []byte) ([]byte, bool, []byte, error) {
case '"': case '"':
return b[:i+1], escaped, b[i+1:], nil return b[:i+1], escaped, b[i+1:], nil
case '\n', '\r': case '\n', '\r':
return nil, escaped, nil, NewParserError(b[i:i+1], "basic strings cannot have new lines") return nil, escaped, nil, NewParserError(b[i:i+1], base+i, "basic strings cannot have new lines")
case '\\': case '\\':
if len(b) < i+2 { if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[i:i+1], "need a character after \\") return nil, escaped, nil, NewParserError(b[i:i+1], base+i, "need a character after \\")
} }
escaped = true escaped = true
i++ // skip the next character i++ // skip the next character
} }
} }
return nil, escaped, nil, NewParserError(b[len(b):], `basic string not terminated by "`) return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `basic string not terminated by "`)
} }
func scanMultilineBasicString(b []byte) ([]byte, bool, []byte, error) { func scanMultilineBasicString(b []byte, base int) ([]byte, bool, []byte, error) {
// ml-basic-string = ml-basic-string-delim [ newline ] ml-basic-body
// ml-basic-string-delim
// ml-basic-string-delim = 3quotation-mark
// ml-basic-body = *mlb-content *( mlb-quotes 1*mlb-content ) [ mlb-quotes ]
//
// mlb-content = mlb-char / newline / mlb-escaped-nl
// mlb-char = mlb-unescaped / escaped
// mlb-quotes = 1*2quotation-mark
// mlb-unescaped = wschar / %x21 / %x23-5B / %x5D-7E / non-ascii
// mlb-escaped-nl = escape ws newline *( wschar / newline )
escaped := false escaped := false
i := 3 i := 3
@@ -227,12 +188,6 @@ func scanMultilineBasicString(b []byte) ([]byte, bool, []byte, error) {
if scanFollowsMultilineBasicStringDelimiter(b[i:]) { if scanFollowsMultilineBasicStringDelimiter(b[i:]) {
i += 3 i += 3
// At that point we found 3 apostrophe, and i is the
// index of the byte after the third one. The scanner
// needs to be eager, because there can be an extra 2
// apostrophe that can be accepted at the end of the
// string.
if i >= len(b) || b[i] != '"' { if i >= len(b) || b[i] != '"' {
return b[:i], escaped, b[i:], nil return b[:i], escaped, b[i:], nil
} }
@@ -244,27 +199,27 @@ func scanMultilineBasicString(b []byte) ([]byte, bool, []byte, error) {
i++ i++
if i < len(b) && b[i] == '"' { if i < len(b) && b[i] == '"' {
return nil, escaped, nil, NewParserError(b[i-3:i+1], `""" not allowed in multiline basic string`) return nil, escaped, nil, NewParserError(b[i-3:i+1], base+i-3, `""" not allowed in multiline basic string`)
} }
return b[:i], escaped, b[i:], nil return b[:i], escaped, b[i:], nil
} }
case '\\': case '\\':
if len(b) < i+2 { if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[len(b):], "need a character after \\") return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), "need a character after \\")
} }
escaped = true escaped = true
i++ // skip the next character i++ // skip the next character
case '\r': case '\r':
if len(b) < i+2 { if len(b) < i+2 {
return nil, escaped, nil, NewParserError(b[len(b):], `need a \n after \r`) return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `need a \n after \r`)
} }
if b[i+1] != '\n' { if b[i+1] != '\n' {
return nil, escaped, nil, NewParserError(b[i:i+2], `need a \n after \r`) return nil, escaped, nil, NewParserError(b[i:i+2], base+i, `need a \n after \r`)
} }
i++ // skip the \n i++ // skip the \n
} }
} }
return nil, escaped, nil, NewParserError(b[len(b):], `multiline basic string not terminated by """`) return nil, escaped, nil, NewParserError(b[len(b):], base+len(b), `multiline basic string not terminated by """`)
} }
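
Note on the pattern in this diff: every scanner now takes a `base` argument (the byte offset of `b[0]` in the original document) and reports errors at `base+i`, so the error position no longer has to be recovered from the highlight subslice by pointer arithmetic. The sketch below illustrates that idea in isolation. It is not the library's code: `parserError`, `newParserError`, and `scanLiteral` are hypothetical stand-ins, and the scanner is simplified (it skips the UTF-8 validation done by the real `scanLiteralString`).

```go
package main

import "fmt"

// parserError is a hypothetical stand-in for the library's ParserError.
// It carries its own absolute byte offset alongside the highlighted bytes.
type parserError struct {
	Highlight []byte
	Offset    int
	Message   string
}

func (e *parserError) Error() string {
	return fmt.Sprintf("offset %d: %s", e.Offset, e.Message)
}

// newParserError mirrors the shape of the three-argument constructor
// used in the diff (highlight, offset, message). Hypothetical helper.
func newParserError(highlight []byte, offset int, msg string) *parserError {
	return &parserError{Highlight: highlight, Offset: offset, Message: msg}
}

// scanLiteral scans a single-quoted literal string starting at b[0].
// base is the offset of b[0] in the full document, so base+i is the
// absolute position of b[i]. Simplified: no UTF-8 validation.
func scanLiteral(b []byte, base int) (token, rest []byte, err error) {
	for i := 1; i < len(b); i++ {
		switch b[i] {
		case '\'':
			return b[:i+1], b[i+1:], nil
		case '\n', '\r':
			return nil, nil, newParserError(b[i:i+1], base+i, "literal strings cannot have new lines")
		}
	}
	// Unterminated: point just past the last byte of the input.
	return nil, nil, newParserError(b[len(b):], base+len(b), "unterminated literal string")
}

func main() {
	doc := []byte("key = 'abc\ndef'")
	start := 6 // index of the opening apostrophe in doc

	// The scanner only sees the suffix, but still reports a
	// document-relative offset because base is threaded through.
	_, _, err := scanLiteral(doc[start:], start)
	fmt.Println(err) // prints "offset 10: literal strings cannot have new lines"
}
```

The caller needs only the offset at which it started scanning. Nested helpers can forward `base+i` in the same way, which is what the `base` parameters added throughout this diff appear to do.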