Compare commits

...

10 Commits

Author SHA1 Message Date
Cursor Agent d15f1d131f fix parser error highlight offsets for non-suffix slices
Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 12:26:40 +00:00
Thomas Pelletier f36a3ece9e Reduce marshal and unmarshal overhead (#1044)
* Reduce marshal and unmarshal overhead

Targeted optimizations to reduce performance overhead introduced by
recent feature additions and the unsafe removal.

Unmarshal:
- parseKeyval: access the node directly in the builder's slice to set
  Raw, bypassing NodeAt which triggers a GC write barrier for the
  nodes-pointer on every key-value expression.
- Iterator.Next: cache the *nodes slice dereference in a local variable
  to avoid repeated pointer-to-slice indirection in the hot loop.

Marshal:
- Guard shouldOmitZero calls with an inlineable options.omitzero check.
  shouldOmitZero has inlining cost 1145 (budget 80), so avoiding the
  function call when omitzero is not set removes per-field overhead.
- Inline the isNil check in encodeMap. isNil has inlining cost 93
  (budget 80), so expanding it at the single hot call site avoids
  per-map-entry function call overhead.

Update README benchmarks.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:08:39 +00:00
Thomas Pelletier 77f3862df4 Fix benchmark script replacing internal package imports (#1042)
* Fix benchmark script replacing internal package imports

The sed command in bench() was replacing all occurrences of the go-toml
module path, including sub-package imports like internal/assert. This
caused the BurntSushi/toml benchmark to fail because it tried to import
github.com/BurntSushi/toml/internal/assert which doesn't exist.

Fix by anchoring the sed pattern to only match the import path when
followed by a closing quote, preserving internal package imports.

Also add a guard in the benchstathtml Python script to give a clear
error instead of an IndexError when no benchmark results are available.

https://claude.ai/code/session_016JGASo49PeFSfCaDxvrGFE

* Update benchmark results in README

https://claude.ai/code/session_016JGASo49PeFSfCaDxvrGFE

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-23 22:00:18 -04:00
Thomas Pelletier 16b1ef5508 Fix parser error pointing to wrong line when last line has no trailing newline (#1041)
When parsing a key without '=' at EOF (e.g., "a = 1\nb = 2\nc"), the
error highlight was an empty slice, causing subsliceOffset to return 0
and the error to point at line 1 instead of line 3. Pass the consumed
key bytes as the highlight instead of the empty remainder.

Fixes #1032

https://claude.ai/code/session_01UWv8pyc8P1ktAPfHpveixj

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-23 21:34:12 -04:00
dependabot[bot] e14bde7c1d build(deps): bump docker/login-action from 3 to 4 (#1039)
Bumps [docker/login-action](https://github.com/docker/login-action) from 3 to 4.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-23 21:04:21 -04:00
dependabot[bot] 4b1ff01eb3 build(deps): bump docker/setup-buildx-action from 3 to 4 (#1040)
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](https://github.com/docker/setup-buildx-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-23 21:04:11 -04:00
Thomas Pelletier 048a25f0f2 Go 1.26 (#1030)
* ci(release): drop unsupported windows/arm targets
2026-03-03 01:29:35 -05:00
dependabot[bot] b3575580f9 build(deps): bump goreleaser/goreleaser-action from 6 to 7 (#1035) 2026-03-03 00:47:47 -05:00
dependabot[bot] a0be52f4c1 build(deps): bump actions/upload-artifact from 6 to 7 (#1036) 2026-03-03 00:47:35 -05:00
Thomas Pelletier 316bfc66a4 Support Unmarshaler interface for tables and array tables (#1027)
Fixes #873

Extend the unstable.Unmarshaler interface support to work with tables
and array tables, not just single values.

When a type implementing unstable.Unmarshaler is the target of a table
(e.g., [table] or [[array]]), the UnmarshalTOML method receives a
synthetic InlineTable node containing all the key-value pairs belonging
to that table.

Key changes:
- Add handleKeyValuesUnmarshaler to collect and process table content
- Add copyExpressionNodes to deep-copy AST nodes for synthetic tables
- Add helper functions in unstable/ast.go for node manipulation
- Update documentation for EnableUnmarshalerInterface
- Add comprehensive tests for table and array table unmarshaling

* Implement bytes-based Unmarshaler interface for tables and arrays (#873)

This change brings back support for the unstable.Unmarshaler interface
for tables and array tables, addressing issue #873.

Key changes:
- Changed UnmarshalTOML signature from (*Node) to ([]byte) to provide
  raw TOML bytes instead of AST nodes
- Added RawMessage type (similar to json.RawMessage) for capturing raw
  TOML bytes for later processing
- Updated handleKeyValuesUnmarshaler to reconstruct key-value lines
  from the parsed keys and raw value bytes
- Added support for slice types implementing Unmarshaler (e.g., RawMessage)
- Removed unused AST helper functions from unstable/ast.go

The bytes-based interface allows users to:
- Get raw TOML bytes for custom parsing
- Delay TOML decoding using RawMessage
- Implement custom unmarshaling logic for complex types

Tests added for:
- Table unmarshaler with various scenarios
- Array table unmarshaler
- Split tables (same parent defined in multiple places)
- RawMessage usage
- Nested tables and mixed regular fields

* Fix lint issues and improve test coverage for Unmarshaler interface

- Apply De Morgan's law in keyNeedsQuoting to satisfy staticcheck QF1001
- Remove unused splitTableUnmarshaler type from test
- Fix unused parameter lint warning in errorUnmarshaler873
- Add test for quoted keys that need special handling
- Add test for error propagation from UnmarshalTOML
- Update customTable873 parser to handle quoted keys properly

Coverage improved:
- handleKeyValuesUnmarshaler: 80.0% -> 93.3%
- keyNeedsQuoting: 66.7% -> 83.3%
- Overall main package: 97.2% -> 97.5%

* Add test for dotted keys to improve coverage

Add TestIssue873_DottedKeys to test dotted key handling (e.g., sub.key = value)
in the Unmarshaler interface. This improves coverage for handleKeyValuesUnmarshaler
from 93.3% to 96.7%.

* Add double pointer test to achieve 100% coverage for handleKeyValues

Add TestIssue873_DoublePointerUnmarshaler to test pointer-to-pointer
to Unmarshaler types. This covers the pointer dereferencing loop in
handleKeyValues, bringing its coverage from 88% to 100%.

Total coverage: 97.4%

* Add Example tests and fix raw value extraction for boolean types

Add two godoc Example tests:
- ExampleDecoder_EnableUnmarshalerInterface_dynamicConfig: shows dynamic
  unmarshaling based on a type field
- ExampleDecoder_EnableUnmarshalerInterface_rawMessage: demonstrates
  RawMessage usage for deferred parsing

Fix handleKeyValuesUnmarshaler to handle values where Raw.Length == 0
(like boolean types) by using value.Data as fallback.

* Preserve original formatting in Unmarshaler by using raw byte ranges

Instead of reconstructing key-value lines from parsed components, now
uses the original raw bytes from the document. This preserves:
- Whitespace around '=' (e.g., "key   =   value")
- String quoting style (basic vs literal)
- Number formats (hex, octal, binary)
- Inline table formatting

Changes:
- Add Raw range tracking to KeyValue expressions in parseKeyval
- Update handleKeyValuesUnmarshaler to use expr.Raw directly
- Remove keyNeedsQuoting helper (no longer needed)
- Add TestIssue873_FormattingPreservation test
- Update expected output in ExampleParser_comments

* Prevent test matrix from canceling on first failure

Add fail-fast: false to the test workflow strategy so that all
OS/Go version combinations continue running even if one fails.
This provides better visibility into which specific combinations
have issues.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-02-11 09:57:23 -05:00
18 changed files with 633 additions and 90 deletions
+1 -1
View File
@@ -19,7 +19,7 @@ jobs:
dry-run: false
language: go
- name: Upload Crash
uses: actions/upload-artifact@v6
uses: actions/upload-artifact@v7
if: failure() && steps.build.outcome == 'success'
with:
name: artifacts
+1 -1
View File
@@ -15,6 +15,6 @@ jobs:
- name: Setup go
uses: actions/setup-go@v6
with:
go-version: "1.25"
go-version: "1.26"
- name: Run tests with coverage
run: ./ci.sh coverage -d "${GITHUB_BASE_REF-HEAD}"
+2 -2
View File
@@ -20,7 +20,7 @@ jobs:
fetch-depth: 0
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
uses: docker/setup-buildx-action@v4
- name: Run Go versions compatibility test
run: |
@@ -28,7 +28,7 @@ jobs:
./test-go-versions.sh --output ./test-results $VERSIONS
- name: Upload test results
uses: actions/upload-artifact@v6
uses: actions/upload-artifact@v7
with:
name: go-versions-test-results
path: |
+1 -1
View File
@@ -15,7 +15,7 @@ jobs:
- name: Setup go
uses: actions/setup-go@v6
with:
go-version: "1.24"
go-version: "1.26"
- name: Run golangci-lint
uses: golangci/golangci-lint-action@v9
with:
+3 -3
View File
@@ -22,15 +22,15 @@ jobs:
- name: Set up Go
uses: actions/setup-go@v6
with:
go-version: "1.25"
go-version: "1.26"
- name: Login to GitHub Container Registry
uses: docker/login-action@v3
uses: docker/login-action@v4
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Run GoReleaser
uses: goreleaser/goreleaser-action@v6
uses: goreleaser/goreleaser-action@v7
with:
distribution: goreleaser
version: '~> v2'
+2 -1
View File
@@ -10,9 +10,10 @@ on:
jobs:
build:
strategy:
fail-fast: false
matrix:
os: [ 'ubuntu-latest', 'windows-latest', 'macos-latest', 'macos-14' ]
go: [ '1.24', '1.25' ]
go: [ '1.25', '1.26' ]
runs-on: ${{ matrix.os }}
name: ${{ matrix.go }}/${{ matrix.os }}
steps:
-3
View File
@@ -22,7 +22,6 @@ builds:
- linux_riscv64
- windows_amd64
- windows_arm64
- windows_arm
- darwin_amd64
- darwin_arm64
- id: tomljson
@@ -42,7 +41,6 @@ builds:
- linux_riscv64
- windows_amd64
- windows_arm64
- windows_arm
- darwin_amd64
- darwin_arm64
- id: jsontoml
@@ -62,7 +60,6 @@ builds:
- linux_arm
- windows_amd64
- windows_arm64
- windows_arm
- darwin_amd64
- darwin_arm64
universal_binaries:
+27 -27
View File
@@ -235,17 +235,17 @@ the AST level. See https://pkg.go.dev/github.com/pelletier/go-toml/v2/unstable.
Execution time speedup compared to other Go TOML libraries:
<table>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/HugoFrontMatter-2</td><td>1.9x</td><td>2.2x</td></tr>
<tr><td>Marshal/ReferenceFile/map-2</td><td>1.7x</td><td>2.1x</td></tr>
<tr><td>Marshal/ReferenceFile/struct-2</td><td>2.2x</td><td>3.0x</td></tr>
<tr><td>Unmarshal/HugoFrontMatter-2</td><td>2.9x</td><td>2.7x</td></tr>
<tr><td>Unmarshal/ReferenceFile/map-2</td><td>2.6x</td><td>2.7x</td></tr>
<tr><td>Unmarshal/ReferenceFile/struct-2</td><td>4.6x</td><td>5.1x</td></tr>
</tbody>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/HugoFrontMatter-2</td><td>2.1x</td><td>2.0x</td></tr>
<tr><td>Marshal/ReferenceFile/map-2</td><td>2.0x</td><td>2.0x</td></tr>
<tr><td>Marshal/ReferenceFile/struct-2</td><td>2.3x</td><td>2.5x</td></tr>
<tr><td>Unmarshal/HugoFrontMatter-2</td><td>3.3x</td><td>2.8x</td></tr>
<tr><td>Unmarshal/ReferenceFile/map-2</td><td>2.9x</td><td>3.0x</td></tr>
<tr><td>Unmarshal/ReferenceFile/struct-2</td><td>4.8x</td><td>5.0x</td></tr>
</tbody>
</table>
<details><summary>See more</summary>
<p>The table above has the results of the most common use-cases. The table below
@@ -253,22 +253,22 @@ contains the results of all benchmarks, including unrealistic ones. It is
provided for completeness.</p>
<table>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/SimpleDocument/map-2</td><td>1.8x</td><td>2.7x</td></tr>
<tr><td>Marshal/SimpleDocument/struct-2</td><td>2.7x</td><td>3.8x</td></tr>
<tr><td>Unmarshal/SimpleDocument/map-2</td><td>3.8x</td><td>3.0x</td></tr>
<tr><td>Unmarshal/SimpleDocument/struct-2</td><td>5.6x</td><td>4.1x</td></tr>
<tr><td>UnmarshalDataset/example-2</td><td>3.0x</td><td>3.2x</td></tr>
<tr><td>UnmarshalDataset/code-2</td><td>2.3x</td><td>2.9x</td></tr>
<tr><td>UnmarshalDataset/twitter-2</td><td>2.6x</td><td>2.7x</td></tr>
<tr><td>UnmarshalDataset/citm_catalog-2</td><td>2.2x</td><td>2.3x</td></tr>
<tr><td>UnmarshalDataset/canada-2</td><td>1.8x</td><td>1.5x</td></tr>
<tr><td>UnmarshalDataset/config-2</td><td>4.1x</td><td>2.9x</td></tr>
<tr><td>geomean</td><td>2.7x</td><td>2.8x</td></tr>
</tbody>
<thead>
<tr><th>Benchmark</th><th>go-toml v1</th><th>BurntSushi/toml</th></tr>
</thead>
<tbody>
<tr><td>Marshal/SimpleDocument/map-2</td><td>2.0x</td><td>2.9x</td></tr>
<tr><td>Marshal/SimpleDocument/struct-2</td><td>2.5x</td><td>3.6x</td></tr>
<tr><td>Unmarshal/SimpleDocument/map-2</td><td>4.2x</td><td>3.4x</td></tr>
<tr><td>Unmarshal/SimpleDocument/struct-2</td><td>5.9x</td><td>4.4x</td></tr>
<tr><td>UnmarshalDataset/example-2</td><td>3.2x</td><td>2.9x</td></tr>
<tr><td>UnmarshalDataset/code-2</td><td>2.4x</td><td>2.8x</td></tr>
<tr><td>UnmarshalDataset/twitter-2</td><td>2.7x</td><td>2.5x</td></tr>
<tr><td>UnmarshalDataset/citm_catalog-2</td><td>2.3x</td><td>2.3x</td></tr>
<tr><td>UnmarshalDataset/canada-2</td><td>1.9x</td><td>1.5x</td></tr>
<tr><td>UnmarshalDataset/config-2</td><td>5.4x</td><td>3.0x</td></tr>
<tr><td>geomean</td><td>2.9x</td><td>2.8x</td></tr>
</tbody>
</table>
<p>This table can be generated with <code>./ci.sh benchmark -a -html</code>.</p>
</details>
+6 -1
View File
@@ -147,7 +147,7 @@ bench() {
pushd "$dir"
if [ "${replace}" != "" ]; then
find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2|${replace}|g" {} \;
find ./benchmark/ -iname '*.go' -exec sed -i -E "s|github.com/pelletier/go-toml/v2\"|${replace}\"|g" {} \;
go get "${replace}"
fi
@@ -195,6 +195,11 @@ for line in reversed(lines[2:]):
"%.1fx" % (float(line[3])/v2), # v1
"%.1fx" % (float(line[7])/v2), # bs
])
if not results:
print("No benchmark results to display.", file=sys.stderr)
sys.exit(1)
# move geomean to the end
results.append(results[0])
del results[0]
+27
View File
@@ -259,6 +259,12 @@ func TestDecodeError_Position(t *testing.T) {
expectedRow: 3,
minCol: 5,
},
{
name: "missing equals on last line without trailing newline",
doc: "a = 1\nb = 2\nc",
expectedRow: 3,
minCol: 1,
},
}
for _, e := range examples {
@@ -280,6 +286,27 @@ func TestDecodeError_Position(t *testing.T) {
}
}
func TestDecodeError_InvalidKeyStartAfterComment(t *testing.T) {
doc := "# comment\n= \"value\""
var out map[string]string
err := Unmarshal([]byte(doc), &out)
assert.Error(t, err)
var derr *DecodeError
if !errors.As(err, &derr) {
t.Fatal("error not in expected format")
}
row, col := derr.Position()
assert.Equal(t, 2, row)
assert.Equal(t, 1, col)
assert.Equal(t, "toml: invalid character at start of key: =", derr.Error())
assert.Equal(t, `1| # comment
2| = "value"
| ~ invalid character at start of key: =`, derr.String())
}
func TestStrictErrorUnwrap(t *testing.T) {
fo := bytes.NewBufferString(`
Missing = 1
+12 -9
View File
@@ -704,15 +704,18 @@ func (enc *Encoder) encodeMap(b []byte, ctx encoderCtx, v reflect.Value) ([]byte
for iter.Next() {
v := iter.Value()
if isNil(v) {
// For nil pointers, convert to zero value of the element type.
// This allows round-trip marshaling of maps with nil pointer values.
// For nil interfaces and nil maps, skip since we can't derive a type.
if v.Kind() == reflect.Ptr {
// Handle nil values: convert nil pointers to zero value,
// skip nil interfaces and nil maps.
switch v.Kind() {
case reflect.Ptr:
if v.IsNil() {
v = reflect.Zero(v.Type().Elem())
} else {
}
case reflect.Interface, reflect.Map:
if v.IsNil() {
continue
}
default:
}
k, err := enc.keyToString(iter.Key())
@@ -936,7 +939,7 @@ func (enc *Encoder) encodeTable(b []byte, ctx encoderCtx, t table) ([]byte, erro
if shouldOmitEmpty(kv.Options, kv.Value) {
continue
}
if shouldOmitZero(kv.Options, kv.Value) {
if kv.Options.omitzero && shouldOmitZero(kv.Options, kv.Value) {
continue
}
hasNonEmptyKV = true
@@ -958,7 +961,7 @@ func (enc *Encoder) encodeTable(b []byte, ctx encoderCtx, t table) ([]byte, erro
if shouldOmitEmpty(table.Options, table.Value) {
continue
}
if shouldOmitZero(table.Options, table.Value) {
if table.Options.omitzero && shouldOmitZero(table.Options, table.Value) {
continue
}
if first {
@@ -995,7 +998,7 @@ func (enc *Encoder) encodeTableInline(b []byte, ctx encoderCtx, t table) ([]byte
if shouldOmitEmpty(kv.Options, kv.Value) {
continue
}
if shouldOmitZero(kv.Options, kv.Value) {
if kv.Options.omitzero && shouldOmitZero(kv.Options, kv.Value) {
continue
}
+5 -4
View File
@@ -9,7 +9,7 @@ YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
# Go versions to test (1.11 through 1.25)
# Go versions to test (1.11 through 1.26)
GO_VERSIONS=(
"1.11"
"1.12"
@@ -26,6 +26,7 @@ GO_VERSIONS=(
"1.23"
"1.24"
"1.25"
"1.26"
)
# Default values
@@ -64,7 +65,7 @@ EXAMPLES:
$0 # Test all Go versions in parallel
$0 --sequential # Test all Go versions sequentially
$0 1.21 1.22 1.23 # Test specific versions
$0 --verbose --output ./results 1.24 1.25 # Verbose output to custom directory
$0 --verbose --output ./results 1.25 1.26 # Verbose output to custom directory
EXIT CODES:
0 Recent Go versions pass (good compatibility)
@@ -136,8 +137,8 @@ fi
# Validate Go versions
for version in "${GO_VERSIONS[@]}"; do
if ! [[ "$version" =~ ^1\.(1[1-9]|2[0-5])$ ]]; then
log_error "Invalid Go version: $version. Supported versions: 1.11-1.25"
if ! [[ "$version" =~ ^1\.(1[1-9]|2[0-6])$ ]]; then
log_error "Invalid Go version: $version. Supported versions: 1.11-1.26"
exit 1
fi
done
+85 -15
View File
@@ -56,13 +56,18 @@ func (d *Decoder) DisallowUnknownFields() *Decoder {
// EnableUnmarshalerInterface allows to enable unmarshaler interface.
//
// With this feature enabled, types implementing the unstable/Unmarshaler
// With this feature enabled, types implementing the unstable.Unmarshaler
// interface can be decoded from any structure of the document. It allows types
// that don't have a straightforward TOML representation to provide their own
// decoding logic.
//
// Currently, types can only decode from a single value. Tables and array tables
// are not supported.
// The UnmarshalTOML method receives raw TOML bytes:
// - For single values: the raw value bytes (e.g., `"hello"` for a string)
// - For tables: all key-value lines belonging to that table
// - For inline tables/arrays: the raw bytes of the inline structure
//
// The unstable.RawMessage type can be used to capture raw TOML bytes for
// later processing, similar to json.RawMessage.
//
// *Unstable:* This method does not follow the compatibility guarantees of
// semver. It can be changed or removed without a new major version being
@@ -599,18 +604,28 @@ func (d *decoder) handleArrayTablePart(key unstable.Iterator, v reflect.Value) (
// cannot handle it.
func (d *decoder) handleTable(key unstable.Iterator, v reflect.Value) (reflect.Value, error) {
if v.Kind() == reflect.Slice {
if v.Len() == 0 {
return reflect.Value{}, unstable.NewParserError(key.Node().Data, "cannot store a table in a slice")
// For non-empty slices, work with the last element
if v.Len() > 0 {
elem := v.Index(v.Len() - 1)
x, err := d.handleTable(key, elem)
if err != nil {
return reflect.Value{}, err
}
if x.IsValid() {
elem.Set(x)
}
return reflect.Value{}, nil
}
elem := v.Index(v.Len() - 1)
x, err := d.handleTable(key, elem)
if err != nil {
return reflect.Value{}, err
// Empty slice - check if it implements Unmarshaler (e.g., RawMessage)
// and we're at the end of the key path
if d.unmarshalerInterface && !key.Next() {
if v.CanAddr() && v.Addr().CanInterface() {
if outi, ok := v.Addr().Interface().(unstable.Unmarshaler); ok {
return d.handleKeyValuesUnmarshaler(outi)
}
}
}
if x.IsValid() {
elem.Set(x)
}
return reflect.Value{}, nil
return reflect.Value{}, unstable.NewParserError(key.Node().Data, "cannot store a table in a slice")
}
if key.Next() {
// Still scoping the key
@@ -624,6 +639,24 @@ func (d *decoder) handleTable(key unstable.Iterator, v reflect.Value) (reflect.V
// Handle root expressions until the end of the document or the next
// non-key-value.
func (d *decoder) handleKeyValues(v reflect.Value) (reflect.Value, error) {
// Check if target implements Unmarshaler before processing key-values.
// This allows types to handle entire tables themselves.
if d.unmarshalerInterface {
vv := v
for vv.Kind() == reflect.Ptr {
if vv.IsNil() {
vv.Set(reflect.New(vv.Type().Elem()))
}
vv = vv.Elem()
}
if vv.CanAddr() && vv.Addr().CanInterface() {
if outi, ok := vv.Addr().Interface().(unstable.Unmarshaler); ok {
// Collect all key-value expressions for this table
return d.handleKeyValuesUnmarshaler(outi)
}
}
}
var rv reflect.Value
for d.nextExpr() {
expr := d.expr()
@@ -653,6 +686,41 @@ func (d *decoder) handleKeyValues(v reflect.Value) (reflect.Value, error) {
return rv, nil
}
// handleKeyValuesUnmarshaler collects all key-value expressions for a table
// and passes them to the Unmarshaler as raw TOML bytes.
func (d *decoder) handleKeyValuesUnmarshaler(u unstable.Unmarshaler) (reflect.Value, error) {
// Collect raw bytes from all key-value expressions for this table.
// We use the Raw field on each KeyValue expression to preserve the
// original formatting (whitespace, quoting style, etc.) from the document.
var buf []byte
for d.nextExpr() {
expr := d.expr()
if expr.Kind != unstable.KeyValue {
d.stashExpr()
break
}
_, err := d.seen.CheckExpression(expr)
if err != nil {
return reflect.Value{}, err
}
// Use the raw bytes from the original document to preserve formatting
if expr.Raw.Length > 0 {
raw := d.p.Raw(expr.Raw)
buf = append(buf, raw...)
}
buf = append(buf, '\n')
}
if err := u.UnmarshalTOML(buf); err != nil {
return reflect.Value{}, err
}
return reflect.Value{}, nil
}
type (
handlerFn func(key unstable.Iterator, v reflect.Value) (reflect.Value, error)
valueMakerFn func() reflect.Value
@@ -697,7 +765,8 @@ func (d *decoder) handleValue(value *unstable.Node, v reflect.Value) error {
if d.unmarshalerInterface {
if v.CanAddr() && v.Addr().CanInterface() {
if outi, ok := v.Addr().Interface().(unstable.Unmarshaler); ok {
return outi.UnmarshalTOML(value)
// Pass raw bytes from the original document
return outi.UnmarshalTOML(d.p.Raw(value.Raw))
}
}
}
@@ -1201,7 +1270,8 @@ func (d *decoder) handleKeyValuePart(key unstable.Iterator, value *unstable.Node
if d.unmarshalerInterface {
if v.CanAddr() && v.Addr().CanInterface() {
if outi, ok := v.Addr().Interface().(unstable.Unmarshaler); ok {
return reflect.Value{}, outi.UnmarshalTOML(value)
// Pass raw bytes from the original document
return reflect.Value{}, outi.UnmarshalTOML(d.p.Raw(value.Raw))
}
}
}
+395 -6
View File
@@ -96,6 +96,132 @@ func ExampleUnmarshal() {
// tags: [go toml]
}
// pluginConfig demonstrates how to implement dynamic unmarshaling
// based on a "type" field. This pattern is useful for plugin systems
// or polymorphic configuration.
type pluginConfig struct {
Type string
Config any
}
func (p *pluginConfig) UnmarshalTOML(data []byte) error {
// First, decode just the type field
var typeOnly struct {
Type string `toml:"type"`
}
if err := toml.Unmarshal(data, &typeOnly); err != nil {
return err
}
p.Type = typeOnly.Type
// Now decode the config based on the type
switch typeOnly.Type {
case "database":
var cfg struct {
Type string `toml:"type"`
Host string `toml:"host"`
Port int `toml:"port"`
}
if err := toml.Unmarshal(data, &cfg); err != nil {
return err
}
p.Config = map[string]any{"host": cfg.Host, "port": cfg.Port}
case "cache":
var cfg struct {
Type string `toml:"type"`
TTL int `toml:"ttl"`
}
if err := toml.Unmarshal(data, &cfg); err != nil {
return err
}
p.Config = map[string]any{"ttl": cfg.TTL}
}
return nil
}
// This example demonstrates dynamic unmarshaling based on a discriminator
// field. The pluginConfig type uses UnmarshalTOML to first read the "type"
// field, then decode the rest of the configuration based on that type.
// This pattern is useful for plugin systems or configuration that varies
// by type.
func ExampleDecoder_EnableUnmarshalerInterface_dynamicConfig() {
doc := `
[[plugins]]
type = "database"
host = "localhost"
port = 5432
[[plugins]]
type = "cache"
ttl = 300
`
type Config struct {
Plugins []pluginConfig `toml:"plugins"`
}
var cfg Config
err := toml.NewDecoder(strings.NewReader(doc)).
EnableUnmarshalerInterface().
Decode(&cfg)
if err != nil {
panic(err)
}
for _, p := range cfg.Plugins {
fmt.Printf("type=%s config=%v\n", p.Type, p.Config)
}
// Output:
// type=database config=map[host:localhost port:5432]
// type=cache config=map[ttl:300]
}
// This example demonstrates using RawMessage to capture raw TOML bytes
// for later processing. RawMessage is similar to json.RawMessage - it
// delays decoding so you can inspect the raw content or decode it
// differently based on context.
func ExampleDecoder_EnableUnmarshalerInterface_rawMessage() {
doc := `
[plugin]
name = "example"
version = "1.0"
enabled = true
`
type Config struct {
Plugin unstable.RawMessage `toml:"plugin"`
}
var cfg Config
err := toml.NewDecoder(strings.NewReader(doc)).
EnableUnmarshalerInterface().
Decode(&cfg)
if err != nil {
panic(err)
}
// cfg.Plugin contains the raw TOML bytes
fmt.Printf("Raw TOML captured:\n%s", cfg.Plugin)
// You can later decode it into a specific type
var plugin struct {
Name string `toml:"name"`
Version string `toml:"version"`
Enabled bool `toml:"enabled"`
}
if err := toml.Unmarshal(cfg.Plugin, &plugin); err != nil {
panic(err)
}
fmt.Printf("Decoded: name=%s version=%s enabled=%v\n",
plugin.Name, plugin.Version, plugin.Enabled)
// Output:
// Raw TOML captured:
// name = "example"
// version = "1.0"
// enabled = true
// Decoded: name=example version=1.0 enabled=true
}
type badReader struct{}
func (r *badReader) Read([]byte) (int, error) {
@@ -3900,8 +4026,8 @@ type CustomUnmarshalerKey struct {
A int64
}
func (k *CustomUnmarshalerKey) UnmarshalTOML(value *unstable.Node) error {
item, err := strconv.ParseInt(string(value.Data), 10, 64)
func (k *CustomUnmarshalerKey) UnmarshalTOML(data []byte) error {
item, err := strconv.ParseInt(string(data), 10, 64)
if err != nil {
return fmt.Errorf("error converting to int64, %w", err)
}
@@ -3989,7 +4115,7 @@ foo = "bar"`,
type doc994 struct{}
func (d *doc994) UnmarshalTOML(*unstable.Node) error {
func (d *doc994) UnmarshalTOML([]byte) error {
return errors.New("expected-error")
}
@@ -4012,8 +4138,8 @@ type doc994ok struct {
S string
}
func (d *doc994ok) UnmarshalTOML(value *unstable.Node) error {
d.S = string(value.Data) + " from unmarshaler"
func (d *doc994ok) UnmarshalTOML(data []byte) error {
d.S = string(data) + " from unmarshaler"
return nil
}
@@ -4026,7 +4152,8 @@ func TestIssue994_OK(t *testing.T) {
Decode(&d)
assert.NoError(t, err)
assert.Equal(t, "bar from unmarshaler", d.S)
// With bytes-based interface, raw TOML bytes are passed including quotes
assert.Equal(t, "\"bar\" from unmarshaler", d.S)
}
func TestIssue995(t *testing.T) {
@@ -4385,3 +4512,265 @@ func TestIssue1028(t *testing.T) {
assert.Error(t, err)
})
}
// Tests for issue #873 - Bring back toml.Unmarshaler for tables and arrays
type customTable873 struct {
Keys []string
Values map[string]string
}
func (c *customTable873) UnmarshalTOML(data []byte) error {
c.Keys = []string{}
c.Values = make(map[string]string)
// Parse the raw TOML bytes into a map to extract keys in order
// For this test, we use a simple line-by-line parser to preserve order
lines := bytes.Split(data, []byte{'\n'})
for _, line := range lines {
line = bytes.TrimSpace(line)
if len(line) == 0 {
continue
}
// Skip table headers
if line[0] == '[' {
continue
}
// Parse key = value
eqIdx := bytes.Index(line, []byte{'='})
if eqIdx < 0 {
continue
}
key := string(bytes.TrimSpace(line[:eqIdx]))
// Remove quotes from quoted keys
if len(key) >= 2 && key[0] == '"' && key[len(key)-1] == '"' {
key = key[1 : len(key)-1]
}
valueBytes := bytes.TrimSpace(line[eqIdx+1:])
// Remove quotes from string values
if len(valueBytes) >= 2 && valueBytes[0] == '"' && valueBytes[len(valueBytes)-1] == '"' {
valueBytes = valueBytes[1 : len(valueBytes)-1]
}
c.Keys = append(c.Keys, key)
c.Values[key] = string(valueBytes)
}
return nil
}
// Test for split tables - when the same parent table is defined in multiple places
// This is a key requirement for issue #873: if type A implements Unmarshaler,
// and [a.b] and [a.d] are defined with another table [x] in between,
// A should receive content for both b and d, but not x.
func TestIssue873_SplitTables(t *testing.T) {
// For this test, we expect each sub-table to be handled separately
// The parent doesn't receive the sub-tables directly - each sub-table
// (b and d) gets its own call to handleKeyValues
type Config struct {
A struct {
B customTable873 `toml:"b"`
D customTable873 `toml:"d"`
} `toml:"a"`
X customTable873 `toml:"x"`
}
doc := `
[a.b]
C = "1"
[x]
Y = "100"
[a.d]
E = "2"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
// Each sub-table should have received its own key-values
assert.Equal(t, []string{"C"}, cfg.A.B.Keys)
assert.Equal(t, "1", cfg.A.B.Values["C"])
assert.Equal(t, []string{"E"}, cfg.A.D.Keys)
assert.Equal(t, "2", cfg.A.D.Values["E"])
assert.Equal(t, []string{"Y"}, cfg.X.Keys)
assert.Equal(t, "100", cfg.X.Values["Y"])
}
// Test using RawMessage to capture raw TOML bytes
func TestIssue873_RawMessage(t *testing.T) {
type Config struct {
Plugin unstable.RawMessage `toml:"plugin"`
}
doc := `
[plugin]
name = "example"
version = "1.0"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
// RawMessage should contain the raw key-value bytes
expected := "name = \"example\"\nversion = \"1.0\"\n"
assert.Equal(t, expected, string(cfg.Plugin))
}
// Test keys that need quoting (contain special characters)
func TestIssue873_QuotedKeys(t *testing.T) {
type Config struct {
Section customTable873 `toml:"section"`
}
doc := `
[section]
"key with spaces" = "value1"
"key.with.dots" = "value2"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, 2, len(cfg.Section.Keys))
assert.Equal(t, "value1", cfg.Section.Values["key with spaces"])
assert.Equal(t, "value2", cfg.Section.Values["key.with.dots"])
}
// errorUnmarshaler873 is used to test error propagation from UnmarshalTOML
type errorUnmarshaler873 struct{}
func (e *errorUnmarshaler873) UnmarshalTOML([]byte) error {
return errors.New("intentional error")
}
// Test error propagation from UnmarshalTOML
func TestIssue873_UnmarshalerError(t *testing.T) {
doc := `
[section]
key = "value"
`
type Config struct {
Section errorUnmarshaler873 `toml:"section"`
}
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.Error(t, err)
assert.True(t, strings.Contains(err.Error(), "intentional error"))
}
// Test dotted keys in a table (e.g., a.b = value)
func TestIssue873_DottedKeys(t *testing.T) {
type Config struct {
Section customTable873 `toml:"section"`
}
doc := `
[section]
sub.key = "value1"
another.nested.key = "value2"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.Equal(t, 2, len(cfg.Section.Keys))
// The dotted keys should be preserved in the raw output
assert.Equal(t, "value1", cfg.Section.Values["sub.key"])
assert.Equal(t, "value2", cfg.Section.Values["another.nested.key"])
}
// Test pointer to pointer to Unmarshaler (covers pointer dereferencing loop)
func TestIssue873_DoublePointerUnmarshaler(t *testing.T) {
type Config struct {
Section **customTable873 `toml:"section"`
}
doc := `
[section]
key = "value"
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.True(t, cfg.Section != nil)
assert.True(t, *cfg.Section != nil)
assert.Equal(t, []string{"key"}, (*cfg.Section).Keys)
assert.Equal(t, "value", (*cfg.Section).Values["key"])
}
// formattingCapture captures the raw TOML bytes to verify formatting preservation
type formattingCapture struct {
RawBytes string
}
func (f *formattingCapture) UnmarshalTOML(data []byte) error {
f.RawBytes = string(data)
return nil
}
func TestIssue873_FormattingPreservation(t *testing.T) {
type Config struct {
Section *formattingCapture `toml:"section"`
}
// Test that various formatting styles are preserved:
// - Extra spaces around '='
// - Literal strings (single quotes)
// - Hex numbers
// - Inline tables
doc := `[section]
key1 = "value with spaces"
key2 = 'literal string'
hex_val = 0xDEADBEEF
inline = { a = 1, b = 2 }
`
var cfg Config
err := toml.NewDecoder(bytes.NewReader([]byte(doc))).
EnableUnmarshalerInterface().
Decode(&cfg)
assert.NoError(t, err)
assert.True(t, cfg.Section != nil)
// The raw bytes should preserve original formatting
raw := cfg.Section.RawBytes
// Check that extra spaces around '=' are preserved
assert.True(t, strings.Contains(raw, "key1 = \"value with spaces\""),
"Expected spacing to be preserved, got: %s", raw)
// Check that literal string style is preserved
assert.True(t, strings.Contains(raw, "key2 = 'literal string'"),
"Expected literal string to be preserved, got: %s", raw)
// Check that hex format is preserved
assert.True(t, strings.Contains(raw, "hex_val = 0xDEADBEEF"),
"Expected hex format to be preserved, got: %s", raw)
// Check that inline table is preserved
assert.True(t, strings.Contains(raw, "inline = { a = 1, b = 2 }"),
"Expected inline table to be preserved, got: %s", raw)
}
+7 -3
View File
@@ -28,12 +28,16 @@ func (c *Iterator) Next() bool {
if c.nodes == nil {
return false
}
nodes := *c.nodes
if !c.started {
c.started = true
} else if c.idx >= 0 {
c.idx = (*c.nodes)[c.idx].next
} else {
idx := c.idx
if idx >= 0 && int(idx) < len(nodes) {
c.idx = nodes[idx].next
}
}
return c.idx >= 0 && int(c.idx) < len(*c.nodes)
return c.idx >= 0 && int(c.idx) < len(nodes)
}
// IsLast returns true if the current node of the iterator is the last
+25 -4
View File
@@ -3,6 +3,7 @@ package unstable
import (
"bytes"
"fmt"
"reflect"
"unicode"
"github.com/pelletier/go-toml/v2/internal/characters"
@@ -83,10 +84,22 @@ func (p *Parser) rangeOfToken(token, rest []byte) Range {
}
// subsliceOffset returns the byte offset of subslice b within p.data.
// b must be a suffix (tail) of p.data.
// b must share the same backing array as p.data.
func (p *Parser) subsliceOffset(b []byte) int {
// b is a suffix of p.data, so its offset is len(p.data) - len(b)
return len(p.data) - len(b)
if len(b) == 0 {
// Most callers pass suffix slices, so preserve EOF behavior.
return len(p.data)
}
dataPtr := reflect.ValueOf(p.data).Pointer()
subPtr := reflect.ValueOf(b).Pointer()
offset := int(subPtr - dataPtr)
if offset < 0 || offset+len(b) > len(p.data) {
panic("subslice is not within parser input")
}
return offset
}
// Raw returns the slice corresponding to the bytes in the given range.
@@ -328,6 +341,9 @@ func (p *Parser) parseStdTable(b []byte) (reference, []byte, error) {
func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
// keyval = key keyval-sep val
// Track the start position for Raw range
startB := b
ref := p.builder.Push(Node{
Kind: KeyValue,
})
@@ -342,7 +358,7 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
b = p.parseWhitespace(b)
if len(b) == 0 {
return invalidReference, nil, NewParserError(b, "expected = after a key, but the document ends there")
return invalidReference, nil, NewParserError(startB[:len(startB)-len(b)], "expected = after a key, but the document ends there")
}
b, err = expect('=', b)
@@ -360,6 +376,11 @@ func (p *Parser) parseKeyval(b []byte) (reference, []byte, error) {
p.builder.Chain(valRef, key)
p.builder.AttachChild(ref, valRef)
// Set Raw to span the entire key-value expression.
// Access the node directly in the slice to avoid the write barrier
// that NodeAt's nodes-pointer setup would trigger.
p.builder.tree.nodes[ref].Raw = p.rangeOfToken(startB[:len(startB)-len(b)], b)
return ref, b, err
}
+6 -6
View File
@@ -539,7 +539,7 @@ key5 = [ # Next to start of inline array.
// ---
// 6:1->6:22 (105->126) | Comment [# Above simple value.]
// ---
// 1:1->1:1 (0->0) | KeyValue []
// 7:1->7:14 (127->140) | KeyValue []
// 7:7->7:14 (133->140) | String [value]
// 7:1->7:4 (127->130) | Key [key]
// 7:15->7:38 (141->164) | Comment [# Next to simple value.]
@@ -552,12 +552,12 @@ key5 = [ # Next to start of inline array.
// ---
// 14:1->14:22 (252->273) | Comment [# Above inline table.]
// ---
// 1:1->1:1 (0->0) | KeyValue []
// 15:1->15:50 (274->323) | KeyValue []
// 15:8->15:9 (281->282) | InlineTable []
// 1:1->1:1 (0->0) | KeyValue []
// 15:10->15:23 (283->296) | KeyValue []
// 15:18->15:23 (291->296) | String [Tom]
// 15:10->15:15 (283->288) | Key [first]
// 1:1->1:1 (0->0) | KeyValue []
// 15:25->15:48 (298->321) | KeyValue []
// 15:32->15:48 (305->321) | String [Preston-Werner]
// 15:25->15:29 (298->302) | Key [last]
// 15:1->15:5 (274->278) | Key [name]
@@ -567,7 +567,7 @@ key5 = [ # Next to start of inline array.
// ---
// 18:1->18:15 (371->385) | Comment [# Above array.]
// ---
// 1:1->1:1 (0->0) | KeyValue []
// 19:1->19:20 (386->405) | KeyValue []
// 1:1->1:1 (0->0) | Array []
// 19:11->19:12 (396->397) | Integer [1]
// 19:14->19:15 (399->400) | Integer [2]
@@ -579,7 +579,7 @@ key5 = [ # Next to start of inline array.
// ---
// 22:1->22:26 (448->473) | Comment [# Above multi-line array.]
// ---
// 1:1->1:1 (0->0) | KeyValue []
// 23:1->31:2 (474->694) | KeyValue []
// 1:1->1:1 (0->0) | Array []
// 23:10->23:42 (483->515) | Comment [# Next to start of inline array.]
// 24:3->24:38 (518->553) | Comment [# Second line before array content.]
+28 -3
View File
@@ -1,7 +1,32 @@
package unstable
// The Unmarshaler interface may be implemented by types to customize their
// behavior when being unmarshaled from a TOML document.
// Unmarshaler is implemented by types that can unmarshal a TOML
// description of themselves. The input is a valid TOML document
// containing the relevant portion of the parsed document.
//
// For tables (including split tables defined in multiple places),
// the data contains the raw key-value bytes from the original document
// with adjusted table headers to be relative to the unmarshaling target.
type Unmarshaler interface {
UnmarshalTOML(value *Node) error
UnmarshalTOML(data []byte) error
}
// RawMessage is a raw encoded TOML value. It implements Unmarshaler
// and can be used to delay TOML decoding or capture raw content.
//
// Example usage:
//
// type Config struct {
// Plugin RawMessage `toml:"plugin"`
// }
//
// var cfg Config
// toml.NewDecoder(r).EnableUnmarshalerInterface().Decode(&cfg)
// // cfg.Plugin now contains the raw TOML bytes for [plugin]
type RawMessage []byte
// UnmarshalTOML implements Unmarshaler.
func (m *RawMessage) UnmarshalTOML(data []byte) error {
*m = append((*m)[0:0], data...)
return nil
}