Commit Graph

25 Commits

Author SHA1 Message Date
Cursor Agent a646ffd9fa Make error position tracking explicit with Offset field on ParserError
Thread byte offset information through all error creation sites,
eliminating the need for SubsliceOffset to recover position from
pointer comparison.

Changes:
- Add Offset field to ParserError struct
- Add offset parameter to NewParserError
- Add Parser.offsetOf helper for suffix-length arithmetic
- Thread base offset through scanner functions (scanComment,
  scanBasicString, scanMultilineBasicString, scanLiteralString,
  scanMultilineLiteralString, scanWindowsNewline)
- Thread base offset through standalone functions (expect, hexToRune)
- Thread base offset through all decode functions (parseInteger,
  parseFloat, parseLocalDate, parseLocalTime, parseLocalDateTime,
  parseDateTime, checkAndRemoveUnderscores*)
- Update all unmarshaler call sites to pass value.Raw.Offset
- Update localtime.go UnmarshalText methods with base=0
- Update strict.go to populate Offset from key ranges
- Change wrapDecodeError to read de.Offset directly
- Change Utf8TomlValidAlreadyEscaped to return int index (-1 if valid)
  instead of a byte subslice
- Unexport SubsliceOffset (now only used internally by Range())

This makes error positions self-describing: each ParserError carries its
own byte offset, so callers no longer need the original document slice
and address arithmetic to determine where an error occurred.
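
The suffix-length arithmetic and the base-offset threading described above can be sketched as follows. This is a minimal illustration, not the library's code: `parserError`, `parser`, and this `expect` signature are simplified, hypothetical stand-ins for the names listed in the commit.

```go
package main

import "fmt"

// parserError is a hypothetical, simplified stand-in for ParserError.
type parserError struct {
	Message string
	Offset  int // byte offset of the error within the original document
}

func (e *parserError) Error() string {
	return fmt.Sprintf("offset %d: %s", e.Offset, e.Message)
}

// parser holds the full document being parsed (assumed layout).
type parser struct {
	data []byte
}

// offsetOf returns the position in p.data where rest begins.
// It is only correct when rest is a suffix of p.data.
func (p *parser) offsetOf(rest []byte) int {
	return len(p.data) - len(rest)
}

// expect consumes the byte want from the front of b. base is the offset
// of b[0] in the document, so the error can carry its own position.
func expect(want byte, b []byte, base int) ([]byte, error) {
	if len(b) == 0 {
		return nil, &parserError{Message: fmt.Sprintf("expected %q, got end of input", want), Offset: base}
	}
	if b[0] != want {
		return nil, &parserError{Message: fmt.Sprintf("expected %q, got %q", want, b[0]), Offset: base}
	}
	return b[1:], nil
}

func main() {
	p := &parser{data: []byte("key : 1")}
	rest := p.data[4:] // pretend the parser has already consumed "key "
	_, err := expect('=', rest, p.offsetOf(rest))
	fmt.Println(err) // the error reports offset 4 without needing p.data
}
```

The caller of `expect` never has to compare the error against the document slice; the offset travels with the error value.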

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 19:08:55 +00:00
Cursor Agent d75117e61f Consolidate subslice offset into a single SubsliceOffset function
Remove the private subsliceOffset methods from both parser.go and
errors.go. Replace them with a single exported SubsliceOffset function
in ast.go (next to the Range type it serves).

SubsliceOffset finds the byte offset by comparing element addresses:
&data[i] == &subslice[0]. This is well-defined Go pointer comparison
on elements of the same backing array.

This fixes the v2.3.0 regression (#1047) where the parser's
subsliceOffset used len(data) - len(b), which only works for suffix
slices, not arbitrary subslices like error highlights. It also removes
the reflect-based implementation from errors.go.
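
A small sketch of the difference between the two approaches, written from the description above rather than copied from ast.go. The function names are lowercase, hypothetical stand-ins, and returning -1 for an empty or unrelated subslice is an assumption of this sketch.

```go
package main

import "fmt"

// subsliceOffset returns the index in data at which subslice begins,
// found by comparing element addresses. It returns -1 when subslice is
// empty or does not point into data (an assumption of this sketch).
func subsliceOffset(data, subslice []byte) int {
	if len(subslice) == 0 {
		return -1
	}
	for i := range data {
		if &data[i] == &subslice[0] {
			return i
		}
	}
	return -1
}

// suffixOffset is the length-based calculation. It is only correct when
// b extends to the end of data.
func suffixOffset(data, b []byte) int {
	return len(data) - len(b)
}

func main() {
	data := []byte(`a = "bad\qescape"`)
	highlight := data[8:10] // the two bytes `\q`, not a suffix of data

	fmt.Println(subsliceOffset(data, highlight)) // 8
	fmt.Println(suffixOffset(data, highlight))   // not 8: highlight ends before data does
	fmt.Println(suffixOffset(data, data[8:]))    // 8: a true suffix works
}
```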

Fixes #1047

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 18:33:22 +00:00
Cursor Agent 19174a4293 Remove cap tricks, use address comparison for subslice offset
Replace cap(parent) - cap(subslice) with a straightforward scan
that compares element addresses: &data[i] == &subslice[0]. This is
well-defined Go pointer comparison on elements of the same backing
array, with no dependency on capacity semantics, reflect, or unsafe.

The scan is O(n) but only runs on error paths, and TOML documents
are small per the project's design constraints.

Also remove the Offset field from ParserError and the setErrOffset
machinery — the offset is computed at the point of consumption
(wrapDecodeError, Parser.Range) rather than cached on the error.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 18:17:55 +00:00
Cursor Agent 96ac48eb74 Remove optional offset and fallback, guarantee offset by construction
ParserError.Offset is now a plain exported int field, always set:
- The parser sets it via setErrOffset() when capturing parse errors
- strict.go sets it from the key's Raw range at construction
- wrapDecodeError computes it inline from cap(document) - cap(highlight)

This eliminates:
- The SetOffset/Offset() accessor methods and offsetValid flag
- The subsliceOffset fallback function in errors.go
- Any conditional logic around whether the offset is present

The offset is guaranteed by construction at every path that creates
or consumes a ParserError.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 17:25:32 +00:00
Cursor Agent f7136d052b Replace reflect-based subslice offset with cap arithmetic
Use cap(parent) - cap(subslice) to compute byte offsets between slices
that share a backing array. This is safe pure Go: subslicing preserves
the backing array and adjusts the capacity accordingly, so the
difference in capacities equals the byte offset.
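
As a rough illustration of the capacity arithmetic (`capOffset` is a hypothetical name, not a function from the repository): for a two-index slice `parent[i:j]`, Go sets the capacity to `cap(parent) - i`, so the difference recovers `i`. The identity does not hold for a three-index slice `parent[i:j:k]`, which caps the result at `k - i`; the sketch shows both cases.

```go
package main

import "fmt"

// capOffset computes the start of sub within parent from capacities.
// It assumes sub was produced by a two-index slice expression on parent.
func capOffset(parent, sub []byte) int {
	return cap(parent) - cap(sub)
}

func main() {
	doc := []byte("name = 12_")

	highlight := doc[7:9] // two-index slice: capacity runs to the end of doc
	fmt.Println(capOffset(doc, highlight)) // 7

	limited := doc[7:9:9] // three-index slice: capacity is capped at 2
	fmt.Println(capOffset(doc, limited)) // no longer 7
}
```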

This removes the reflect import from both errors.go and
unstable/parser.go, eliminating the last reflect-based pointer
arithmetic used for error position tracking.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 17:08:58 +00:00
Cursor Agent 154d80392f Cache error offset in ParserError for safer position tracking
Instead of requiring downstream consumers to re-derive the byte offset
from pointer arithmetic on the Highlight slice, compute and cache the
offset inside the parser at error-capture time via setErrOffset().

This is safer because:
- The parser is the one place where the backing-array guarantee is known
  to hold (Highlight is always a subslice of the parse buffer)
- Downstream consumers (wrapDecodeError) can use the cached offset
  directly, avoiding the need for pointer comparison
- Errors created outside the parser (strict.go) set the offset from
  existing Raw ranges, which are already correct by construction

Add ParserError.SetOffset/Offset methods for setting and retrieving the
cached offset. Update wrapDecodeError to prefer the cached offset when
available, falling back to subsliceOffset for backward compatibility.

Co-authored-by: Thomas Pelletier <thomas@pelletier.dev>
2026-04-12 13:00:38 +00:00
Thomas Pelletier 003aa0993b Fix nil pointer map values not being marshaled (#1025)
When marshaling a map with nil pointer values, the keys were being
silently dropped, breaking round-trip fidelity. For example,

    map[string]*struct{}{"foo": nil}

produced an empty TOML document instead of "[foo]".

This change converts nil pointer values in maps to their zero values
(consistent with how nil pointers in slices are handled), allowing the
keys to be preserved as empty tables.

Nil interface values (map[string]any{"foo": nil}) are still skipped
since there's no type information to derive a zero value.
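
The rule can be sketched with the standard `reflect` package alone. This is not the encoder's actual code: `encodableEntries` is a hypothetical helper that only decides which entries of a string-keyed map survive and what value each one carries.

```go
package main

import (
	"fmt"
	"reflect"
)

// encodableEntries returns the entries of m (a map with string keys)
// that would be kept under the rule described above: nil pointers become
// the zero value of their element type, nil interfaces are skipped.
func encodableEntries(m any) map[string]reflect.Value {
	out := map[string]reflect.Value{}
	iter := reflect.ValueOf(m).MapRange()
	for iter.Next() {
		v := iter.Value()
		switch {
		case v.Kind() == reflect.Interface && v.IsNil():
			// No type information to derive a zero value from.
			continue
		case v.Kind() == reflect.Ptr && v.IsNil():
			// The pointer type still names its element type.
			v = reflect.Zero(v.Type().Elem())
		}
		out[iter.Key().String()] = v
	}
	return out
}

func main() {
	ptrs := encodableEntries(map[string]*struct{}{"foo": nil})
	fmt.Println(len(ptrs), ptrs["foo"].Kind()) // 1 struct

	ifaces := encodableEntries(map[string]any{"foo": nil})
	fmt.Println(len(ifaces)) // 0
}
```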

Fixes #975

Also, pin golangci-lint version to v2.8.0 in CI and document it in AGENTS.md

- Explicitly set golangci-lint version in lint.yml to ensure consistent
  behavior across CI runs
- Update AGENTS.md with instructions to use the same linter version locally

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-09 11:08:31 -05:00
Thomas Pelletier 3aaf147e3e Remove unsafe package usage (#1021)
Removes all unsafe operations from go-toml, making the codebase
fully safe Go code. The internal/danger package that contained
unsafe operations has been deleted.

Changes:
- Replace pointer-based node navigation with index-based navigation
- Node.next and Node.child now store absolute indices into the
  backing nodes slice instead of relative offsets
- Add nodes pointer to Node and Iterator for safe navigation
- Replace danger.TypeID with reflect.Type for cache keys
- Delete internal/danger package entirely

Performance overhead is under 10% compared to the unsafe version,
which is acceptable for the safety and maintainability benefits.
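
A minimal sketch of index-based navigation, assuming a simplified `node` type. The field layout and the -1 "no node" sentinel are assumptions made for illustration, not the library's actual representation.

```go
package main

import "fmt"

// node is a hypothetical, simplified AST node. next and child are
// absolute indices into the shared nodes slice; -1 means "none".
type node struct {
	data  string
	next  int
	child int
	nodes *[]node // backing storage, so navigation needs no pointer arithmetic
}

// Next returns the next sibling, or nil if there is none.
func (n *node) Next() *node {
	if n.next < 0 {
		return nil
	}
	return &(*n.nodes)[n.next]
}

// Child returns the first child, or nil if there is none.
func (n *node) Child() *node {
	if n.child < 0 {
		return nil
	}
	return &(*n.nodes)[n.child]
}

// children collects the data of every direct child of n, in order.
func children(n *node) []string {
	var out []string
	for c := n.Child(); c != nil; c = c.Next() {
		out = append(out, c.data)
	}
	return out
}

func main() {
	nodes := make([]node, 3)
	nodes[0] = node{data: "key-value", next: -1, child: 1, nodes: &nodes}
	nodes[1] = node{data: "key", next: 2, child: -1, nodes: &nodes}
	nodes[2] = node{data: "value", next: -1, child: -1, nodes: &nodes}

	fmt.Println(children(&nodes[0])) // [key value]
}
```

Holding a pointer to the slice, not to an element, keeps navigation valid if the slice is later reallocated by `append`, since the indices are resolved against the current backing array each time.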

[Cursor][claude-sonnet-4-20250514]
2026-01-04 13:16:47 -05:00
Nathan Baulch a675c6b3e2 Upgrade to golangci-lint v2 (#1008) 2026-01-04 09:54:29 -05:00
Étienne BERSAC 4369957cb4 Unwrap strict errors (#1012) 2025-12-21 16:20:24 +01:00
Thomas Pelletier e195b58fd0 Expose parser API as unstable (#827) 2022-11-09 16:12:39 -05:00
Thomas Pelletier 67bc5422f3 Go 1.19 (#802) 2022-08-15 10:56:33 -04:00
Thomas Pelletier e83cf535f5 Decoder: rename SetStrict to DisallowUnknownFields (#731) 2022-01-02 14:32:34 -05:00
Thomas Pelletier 4a5ae9e81e errors: fix context generation with only one line 2021-09-07 10:36:22 -04:00
Thomas Pelletier 618f0181ac AST Tweaks (#551)
* Use pointers instead of copying around ast.Node

Node is a 56B struct that is constantly in the hot path. Passing nodes
around by copy had a cost that started to add up. This change replaces
the copies with pointers. By using unsafe pointer arithmetic and converting
sibling/child indexes to relative offsets, it removes the need to carry
around a pointer to the root of the tree. This saves 8B per Node. This
space will be used to store an extra []byte slice to provide contextual
error handling on all nodes, including the ones whose data differs from
the raw input (for example: strings with escaped characters), while
staying under the size of a cache line.

* Remove conditional

* Add Raw to track range in data for parsed values

* Simplify reference tracking
2021-06-03 21:48:51 -04:00
Thomas Pelletier 95c701b253 Increase test coverage (#538)
Also fix array in map bug.
2021-05-10 20:17:05 -04:00
Thomas Pelletier 45ea20024b Readme (#535) 2021-05-08 17:03:51 -04:00
Thomas Pelletier ea225df3ed v2: errors (#534)
```
name                              old time/op    new time/op    delta
UnmarshalDataset/config-32          86.7ms ± 2%    87.5ms ± 2%     ~     (p=0.113 n=9+10)
UnmarshalDataset/canada-32           129ms ± 4%     106ms ± 3%  -17.94%  (p=0.000 n=10+10)
UnmarshalDataset/citm_catalog-32    59.4ms ± 5%    58.7ms ± 5%     ~     (p=0.393 n=10+10)
UnmarshalDataset/twitter-32         27.0ms ± 7%    26.9ms ± 6%     ~     (p=0.720 n=10+9)
UnmarshalDataset/code-32             326ms ± 4%     322ms ± 7%     ~     (p=0.661 n=9+10)
UnmarshalDataset/example-32          510µs ±11%     526µs ± 7%     ~     (p=0.182 n=10+9)
UnmarshalSimple-32                  1.41µs ± 6%    1.41µs ± 4%     ~     (p=0.736 n=10+9)
ReferenceFile-32                    45.6µs ± 3%    43.9µs ±10%     ~     (p=0.089 n=10+10)

name                              old speed      new speed      delta
UnmarshalDataset/config-32        12.1MB/s ± 2%  12.0MB/s ± 2%     ~     (p=0.108 n=9+10)
UnmarshalDataset/canada-32        17.1MB/s ± 4%  20.9MB/s ± 3%  +21.86%  (p=0.000 n=10+10)
UnmarshalDataset/citm_catalog-32  9.41MB/s ± 5%  9.51MB/s ± 5%     ~     (p=0.362 n=10+10)
UnmarshalDataset/twitter-32       16.4MB/s ± 8%  16.5MB/s ± 6%     ~     (p=0.704 n=10+9)
UnmarshalDataset/code-32          8.24MB/s ± 4%  8.34MB/s ± 7%     ~     (p=0.675 n=9+10)
UnmarshalDataset/example-32       15.9MB/s ±11%  15.4MB/s ± 7%     ~     (p=0.182 n=10+9)
ReferenceFile-32                   115MB/s ± 4%   120MB/s ±10%     ~     (p=0.085 n=10+10)

name                              old alloc/op   new alloc/op   delta
UnmarshalDataset/config-32          16.9MB ± 0%    16.9MB ± 0%   -0.02%  (p=0.000 n=10+10)
UnmarshalDataset/canada-32          76.8MB ± 0%    74.3MB ± 0%   -3.31%  (p=0.000 n=10+10)
UnmarshalDataset/citm_catalog-32    37.3MB ± 0%    37.1MB ± 0%   -0.60%  (p=0.000 n=9+10)
UnmarshalDataset/twitter-32         15.6MB ± 0%    15.6MB ± 0%   -0.09%  (p=0.000 n=10+10)
UnmarshalDataset/code-32            60.2MB ± 0%    59.3MB ± 0%   -1.51%  (p=0.000 n=10+9)
UnmarshalDataset/example-32          238kB ± 0%     238kB ± 0%   -0.18%  (p=0.000 n=10+10)
ReferenceFile-32                    11.8kB ± 0%    11.8kB ± 0%     ~     (all equal)

name                              old allocs/op  new allocs/op  delta
UnmarshalDataset/config-32            653k ± 0%      645k ± 0%   -1.20%  (p=0.000 n=10+6)
UnmarshalDataset/canada-32           1.01M ± 0%     0.90M ± 0%  -11.04%  (p=0.000 n=9+10)
UnmarshalDataset/citm_catalog-32      384k ± 0%      370k ± 0%   -3.75%  (p=0.000 n=10+10)
UnmarshalDataset/twitter-32           160k ± 0%      157k ± 0%   -1.32%  (p=0.000 n=10+10)
UnmarshalDataset/code-32             2.97M ± 0%     2.91M ± 0%   -2.15%  (p=0.000 n=10+7)
UnmarshalDataset/example-32          3.69k ± 0%     3.63k ± 0%   -1.52%  (p=0.000 n=10+10)
ReferenceFile-32                       253 ± 0%       253 ± 0%     ~     (all equal)
```
2021-05-08 16:04:25 -04:00
Vincent Serpoul 2b1c52dddd golangci-lint: decoder/unmarshal (#518) 2021-04-22 09:29:23 -04:00
Thomas Pelletier 9b67e40640 decoder: strict mode (#512) 2021-04-20 21:26:22 -04:00
Vincent Serpoul 59cddbc573 Golangci-lint v2 part two (#498) 2021-04-15 10:29:46 -04:00
Thomas Pelletier 92b16cad91 Simplify context implementation and fix new lines bug 2021-03-31 09:57:19 -04:00
Thomas Pelletier 32da85ab11 Decoding error position tracking 2021-03-30 21:43:57 -04:00
Thomas Pelletier 18d45c446b wip: decoder errors 2021-03-30 19:52:02 -04:00
Thomas Pelletier cf288a51c5 Wip errors reporting 2021-03-30 10:59:35 -04:00