Commit Graph

111 Commits

Author SHA1 Message Date
Thomas Pelletier b371733c67 Make all nodes contain Raw 2022-08-22 21:05:41 -04:00
Thomas Pelletier 64dcce07ea WIP 2022-08-22 23:04:44 +00:00
Thomas Pelletier 9804fc57e0 decoder: support \e escape sequence (#748) 2022-04-07 20:18:30 -04:00
Cameron Moore 128b7a8bfb Decode: check buffer length before parsing simple key (#717)
Fixes #714
2021-12-29 08:58:42 -05:00
Thomas Pelletier 177b4a5e53 Decode: allow \r\n as line whitespace before \ (#709)
Fixes #708
2021-12-26 16:38:15 +01:00
Thomas Pelletier 9bf9be681e Decoder: check for invalid chars in timezone (#695)
Fixes #694
2021-12-02 09:00:20 -05:00
Thomas Pelletier bbaae540ce Decoder: check timezones start with +,-,z,Z (#688)
Also simplifies local time seconds scanning.

Fixes #686
2021-11-30 13:01:15 -05:00
Cameron Moore 2dbd29a565 parser: Fix missing check for upper exponent (#665) 2021-11-09 21:15:23 -05:00
Thomas Pelletier 11f789ef11 Decode: prevent comments that look like dates from being accepted (#657)
* parser: fix date detection

When the parser has to decide between parsing an integer or a date, it should
check that all characters are actually acceptable (digits, or date/time
elements).

Fixes #655
2021-11-04 22:06:12 -04:00
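
The check described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper names `isDateTimeChar` and `looksLikeDate`, and the exact set of accepted characters, are assumptions.

```go
package main

import "fmt"

// isDateTimeChar reports whether b may appear in a date or time value.
// The accepted set is an assumption made for this sketch.
func isDateTimeChar(b byte) bool {
	switch {
	case b >= '0' && b <= '9':
		return true
	case b == '-' || b == ':' || b == '.' || b == '+':
		return true
	case b == 'T' || b == 't' || b == 'Z' || b == 'z' || b == ' ':
		return true
	}
	return false
}

// looksLikeDate reports whether token should be parsed as a date rather than
// an integer: it must contain a date/time separator, and every byte must be a
// digit or a date/time element.
func looksLikeDate(token []byte) bool {
	hasSeparator := false
	for i, b := range token {
		if !isDateTimeChar(b) {
			return false
		}
		// A leading '-' is a sign, not a date separator.
		if b == ':' || (b == '-' && i > 0) {
			hasSeparator = true
		}
	}
	return hasSeparator
}

func main() {
	fmt.Println(looksLikeDate([]byte("1979-05-27")))   // true
	fmt.Println(looksLikeDate([]byte("1979")))         // false
	fmt.Println(looksLikeDate([]byte("1979-05-27#x"))) // false
}
```

Scanning the whole token, instead of stopping at the first separator, is what rejects input that only starts out looking like a date.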
Thomas Pelletier 3dbca20bc9 Decoder: flag invalid carriage returns in strings (#652)
Fixes #651
2021-11-02 10:02:25 -04:00
Thomas Pelletier 39f893ad99 Multiline strings fixes (#643)
* scanner: allow multiline strings to end with "" or ''

* parser: trim all whitespace after \ in multiline
2021-10-28 18:26:34 -04:00
Thomas Pelletier 4d7c9ddac7 Floats and integers parsing fixes (#638)
* parser: fix scan of float with exp but no decimal
* decoder: validate leading zeros for decimals
2021-10-22 22:25:56 -04:00
Thomas Pelletier 85f5d567e4 parser: validate invalid ASCII control characters 2021-10-16 07:41:12 -04:00
Thomas Pelletier cd54472d03 Validate UTF-8 (#629) 2021-10-15 19:13:21 -04:00
jidicula 86632bc190 parser: fail when missing array separator (#616)
Co-authored-by: Thomas Pelletier <thomas@pelletier.codes>
2021-10-14 08:26:29 -04:00
Cameron Moore 476492a85c unmarshal: support lowercase 'T' and 'Z' in date-time parsing (#601)
RFC 3339 allows for lowercase 't' and 'z' in date-time values.

Fixes #600
2021-09-25 10:02:23 -07:00
Thomas Pelletier fa56f48daf parser: don't overflow when parsing bad times (#593)
Fixes #585
2021-09-09 11:59:37 -04:00
Thomas Pelletier a0d685d482 unmarshal: don't crash on unterminated inline table (#587)
Fixes #586
2021-09-07 20:08:59 -04:00
Thomas Pelletier 7e2fa1bc80 unmarshal: fix non-terminated array error
Fixes #581
2021-09-07 10:36:22 -04:00
Thomas Pelletier 40cfb6f458 parser: don't crash on unterminated table key (#580)
* parser: don't crash on unterminated table key

Fixes #579

* parser: fix format of error returned by expect

EOF was missing the format string and %U is not very human friendly.
2021-09-06 12:18:45 -04:00
kkHAIKE 8be357dfa1 Add LocalTime to interface{} decode support (#567)
Co-authored-by: Thomas Pelletier <thomas@pelletier.codes>
2021-07-21 17:50:12 +02:00
kkHAIKE a93b34d984 Unicode parsing optimization (#568)
Inlines the call to hexToRune and uses specialized parsing, as found in encoding/json.

Co-authored-by: Thomas Pelletier <thomas@pelletier.codes>
2021-07-21 10:50:03 +02:00
Thomas Pelletier 618f0181ac AST Tweaks (#551)
* Use pointers instead of copying around ast.Node

Node is a 56B struct that is constantly in the hot path. Passing nodes
around by copy had a cost that started to add up. This change replaces
them with pointers. Using unsafe pointer arithmetic and converting
sibling/child indexes to relative offsets, it removes the need to carry
around a pointer to the root of the tree. This saves 8B per Node. This
space will be used to store an extra []byte slice to provide contextual
error handling on all nodes, including the ones whose data is different
than the raw input (for example: strings with escaped characters), while
staying under the size of a cache line.

* Remove conditional

* Add Raw to track range in data for parsed values

* Simplify reference tracking
2021-06-03 21:48:51 -04:00
Thomas Pelletier b0d6c62255 Don't use bytes.Buffer when not necessary (#549)
When parsing strings, they can be referenced directly from the document
when they don't contain escaped characters. This avoids paying the cost
of allocating (and sometimes growing) the bytes buffer unnecessarily.
2021-06-01 09:51:59 -04:00
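
A rough sketch of that fast path, assuming a hypothetical helper `unescapeBasic` that receives the body of a basic string with the quotes already removed. It handles only a few escapes for illustration and is not the parser's actual code.

```go
package main

import (
	"bytes"
	"fmt"
)

// unescapeBasic returns the decoded content of a basic string body.
// When raw contains no backslash, the returned slice aliases raw and nothing
// is allocated. Only \n, \t, \\ and \" are handled in this sketch.
func unescapeBasic(raw []byte) []byte {
	if bytes.IndexByte(raw, '\\') < 0 {
		return raw // fast path: reference the document directly
	}

	var buf bytes.Buffer
	buf.Grow(len(raw))
	for i := 0; i < len(raw); i++ {
		c := raw[i]
		if c == '\\' && i+1 < len(raw) {
			i++
			switch raw[i] {
			case 'n':
				c = '\n'
			case 't':
				c = '\t'
			default:
				c = raw[i] // covers \\ and \"
			}
		}
		buf.WriteByte(c)
	}
	return buf.Bytes()
}

func main() {
	plain := []byte("hello")
	out := unescapeBasic(plain)
	fmt.Println(string(out), &out[0] == &plain[0]) // hello true

	escaped := []byte(`tab\there`)
	fmt.Printf("%q\n", unescapeBasic(escaped)) // "tab\there"
}
```

The trade-off is that the returned slice shares memory with the input document, so the caller must not modify the document while the value is in use.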
Thomas Pelletier c2d1fd86e5 Fix timezone detection when time has fractional component (#544) 2021-05-21 09:37:43 -04:00
Thomas Pelletier 95c701b253 Increase test coverage (#538)
Also fix array in map bug.
2021-05-10 20:17:05 -04:00
Thomas Pelletier ea225df3ed v2: errors (#534)
```
name                              old time/op    new time/op    delta
UnmarshalDataset/config-32          86.7ms ± 2%    87.5ms ± 2%     ~     (p=0.113 n=9+10)
UnmarshalDataset/canada-32           129ms ± 4%     106ms ± 3%  -17.94%  (p=0.000 n=10+10)
UnmarshalDataset/citm_catalog-32    59.4ms ± 5%    58.7ms ± 5%     ~     (p=0.393 n=10+10)
UnmarshalDataset/twitter-32         27.0ms ± 7%    26.9ms ± 6%     ~     (p=0.720 n=10+9)
UnmarshalDataset/code-32             326ms ± 4%     322ms ± 7%     ~     (p=0.661 n=9+10)
UnmarshalDataset/example-32          510µs ±11%     526µs ± 7%     ~     (p=0.182 n=10+9)
UnmarshalSimple-32                  1.41µs ± 6%    1.41µs ± 4%     ~     (p=0.736 n=10+9)
ReferenceFile-32                    45.6µs ± 3%    43.9µs ±10%     ~     (p=0.089 n=10+10)

name                              old speed      new speed      delta
UnmarshalDataset/config-32        12.1MB/s ± 2%  12.0MB/s ± 2%     ~     (p=0.108 n=9+10)
UnmarshalDataset/canada-32        17.1MB/s ± 4%  20.9MB/s ± 3%  +21.86%  (p=0.000 n=10+10)
UnmarshalDataset/citm_catalog-32  9.41MB/s ± 5%  9.51MB/s ± 5%     ~     (p=0.362 n=10+10)
UnmarshalDataset/twitter-32       16.4MB/s ± 8%  16.5MB/s ± 6%     ~     (p=0.704 n=10+9)
UnmarshalDataset/code-32          8.24MB/s ± 4%  8.34MB/s ± 7%     ~     (p=0.675 n=9+10)
UnmarshalDataset/example-32       15.9MB/s ±11%  15.4MB/s ± 7%     ~     (p=0.182 n=10+9)
ReferenceFile-32                   115MB/s ± 4%   120MB/s ±10%     ~     (p=0.085 n=10+10)

name                              old alloc/op   new alloc/op   delta
UnmarshalDataset/config-32          16.9MB ± 0%    16.9MB ± 0%   -0.02%  (p=0.000 n=10+10)
UnmarshalDataset/canada-32          76.8MB ± 0%    74.3MB ± 0%   -3.31%  (p=0.000 n=10+10)
UnmarshalDataset/citm_catalog-32    37.3MB ± 0%    37.1MB ± 0%   -0.60%  (p=0.000 n=9+10)
UnmarshalDataset/twitter-32         15.6MB ± 0%    15.6MB ± 0%   -0.09%  (p=0.000 n=10+10)
UnmarshalDataset/code-32            60.2MB ± 0%    59.3MB ± 0%   -1.51%  (p=0.000 n=10+9)
UnmarshalDataset/example-32          238kB ± 0%     238kB ± 0%   -0.18%  (p=0.000 n=10+10)
ReferenceFile-32                    11.8kB ± 0%    11.8kB ± 0%     ~     (all equal)

name                              old allocs/op  new allocs/op  delta
UnmarshalDataset/config-32            653k ± 0%      645k ± 0%   -1.20%  (p=0.000 n=10+6)
UnmarshalDataset/canada-32           1.01M ± 0%     0.90M ± 0%  -11.04%  (p=0.000 n=9+10)
UnmarshalDataset/citm_catalog-32      384k ± 0%      370k ± 0%   -3.75%  (p=0.000 n=10+10)
UnmarshalDataset/twitter-32           160k ± 0%      157k ± 0%   -1.32%  (p=0.000 n=10+10)
UnmarshalDataset/code-32             2.97M ± 0%     2.91M ± 0%   -2.15%  (p=0.000 n=10+7)
UnmarshalDataset/example-32          3.69k ± 0%     3.63k ± 0%   -1.52%  (p=0.000 n=10+10)
ReferenceFile-32                       253 ± 0%       253 ± 0%     ~     (all equal)
```
2021-05-08 16:04:25 -04:00
Vincent Serpoul 3f2bb0b363 golangci-lint (#530) 2021-05-06 22:29:21 -04:00
Vincent Serpoul 201d5dd422 golangci-lint: misc (#529) 2021-04-27 20:29:00 -04:00
Thomas Pelletier 1e80267558 parser: require \n after parsing integer in kv (#527)
Fixes #526
2021-04-24 09:57:21 -04:00
Vincent Serpoul 2b1c52dddd golangci-lint: decoder/unmarshal (#518) 2021-04-22 09:29:23 -04:00
Thomas Pelletier 37714006b6 V2 Marshaler MVP (#495) 2021-04-08 10:07:29 -04:00
Thomas Pelletier 32da85ab11 Decoding error position tracking 2021-03-30 21:43:57 -04:00
Thomas Pelletier 51d78a5f0c Fix unmarshaling of literal keys
Ref #427.
2021-03-29 20:58:51 -04:00
Cameron Moore 7d8ea80dc3 Fix scanning of float with leading zero (#486) 2021-03-29 20:07:26 -04:00
Thomas Pelletier 829c005784 Fix unicode decoding 2021-03-28 11:03:43 -04:00
Thomas Pelletier b24eb93e8e Fix literal multiline parsing 2021-03-28 00:23:50 -04:00
Thomas Pelletier 7dc5550057 Fix multiline basic string parsing 2021-03-28 00:17:58 -04:00
Thomas Pelletier 72c999ecbf Fix trailing commas in arrays 2021-03-28 00:04:25 -04:00
Thomas Pelletier 636a75f316 Import tomltestgen
A handful are failing.
2021-03-26 09:51:35 -04:00
Thomas Pelletier 390927a0cd Reuse AST storage between top-level expressions
```
Comparing:
	old: v2-wip/1da2fc7 (2021-03-25 20:38:05 -0400 -0400)
	run: v2-wip/3f23ab9 (2021-03-25 22:35:06 -0400 -0400)
-----------------------------------------------------------
name                  old time/op    new time/op    delta
UnmarshalSimple/v2-8     700ns ± 3%     705ns ± 2%     ~     (p=0.690 n=5+5)
UnmarshalSimple/v1-8    3.85µs ± 1%    4.02µs ± 4%   +4.19%  (p=0.032 n=5+5)
UnmarshalSimple/bs-8    2.34µs ± 2%    2.38µs ± 3%     ~     (p=0.310 n=5+5)
ReferenceFile/v2-8      32.2µs ±13%    23.9µs ± 1%  -25.79%  (p=0.008 n=5+5)
ReferenceFile/v1-8       270µs ± 2%     264µs ± 2%     ~     (p=0.095 n=5+5)
ReferenceFile/bs-8       291µs ± 0%     294µs ± 0%   +0.88%  (p=0.008 n=5+5)

name                  old alloc/op   new alloc/op   delta
ReferenceFile/v2-8      37.1kB ± 0%     6.7kB ± 0%  -81.91%  (p=0.008 n=5+5)
ReferenceFile/v1-8       131kB ± 0%     131kB ± 0%     ~     (p=0.444 n=5+5)
ReferenceFile/bs-8      80.8kB ± 0%    80.8kB ± 0%     ~     (p=0.571 n=5+5)

name                  old allocs/op  new allocs/op  delta
ReferenceFile/v2-8         152 ± 0%       148 ± 0%   -2.63%  (p=0.008 n=5+5)
ReferenceFile/v1-8       2.65k ± 0%     2.65k ± 0%     ~     (all equal)
ReferenceFile/bs-8       1.73k ± 0%     1.73k ± 0%     ~     (all equal)

~/s/g/p/g/benchmark$ go test -bench=.
goos: linux
goarch: amd64
pkg: github.com/pelletier/go-toml/v2/benchmark
cpu: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
BenchmarkUnmarshalSimple/v2-8         	 1692444	       710.7 ns/op
BenchmarkUnmarshalSimple/v1-8         	  307609	      3862 ns/op
BenchmarkUnmarshalSimple/bs-8         	  520429	      2285 ns/op
BenchmarkReferenceFile/v2-8           	   50395	     24006 ns/op	    6704 B/op	     148 allocs/op
BenchmarkReferenceFile/v1-8           	    4144	    264655 ns/op	  130567 B/op	    2649 allocs/op
BenchmarkReferenceFile/bs-8           	    3969	    293635 ns/op	   80784 B/op	    1729 allocs/op
PASS
ok  	github.com/pelletier/go-toml/v2/benchmark	8.143s
```
2021-03-25 22:37:16 -04:00
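
One way storage reuse like this can work is to truncate the node slice between expressions instead of allocating a new one. The sketch below assumes that approach; the `builder` type and its methods are hypothetical and are not taken from the commit.

```go
package main

import "fmt"

// node is a hypothetical, simplified AST node.
type node struct {
	data string
}

// builder accumulates the nodes of one top-level expression at a time.
type builder struct {
	nodes []node
}

// reset drops the previous expression's nodes but keeps the backing array,
// so the next expression can reuse the already allocated capacity.
func (b *builder) reset() {
	b.nodes = b.nodes[:0]
}

// push appends a node for the current expression.
func (b *builder) push(data string) {
	b.nodes = append(b.nodes, node{data: data})
}

func main() {
	var b builder
	exprs := [][]string{{"a", "1"}, {"b", "2"}, {"c", "3"}}
	for _, expr := range exprs {
		b.reset()
		for _, tok := range expr {
			b.push(tok)
		}
		// After the first iteration the capacity stays the same,
		// which means no new backing array was allocated.
		fmt.Println(len(b.nodes), cap(b.nodes))
	}
}
```

The catch with this pattern is that nodes from a previous expression become invalid after `reset`, so anything that must outlive an expression has to be copied out first.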
Thomas Pelletier 1bae751a45 Linear array storage for AST 2021-03-25 19:56:02 -04:00
Thomas Pelletier e78ccff9a4 Fix parsing integer 0 2021-03-23 09:02:48 -04:00
Thomas Pelletier fcc91f2618 Progress on date/times 2021-03-22 09:59:15 -04:00
Thomas Pelletier f9f9ccb777 Basic array table implementation 2021-03-16 10:24:19 -04:00
Thomas Pelletier c6892fcf5a wip array table 2021-03-15 19:35:48 -04:00
Thomas Pelletier 00b2f776a9 Replace branch with AST version 2021-03-15 08:46:35 -04:00
Thomas Pelletier 93a74fca35 todo: inline tables 2021-03-08 21:59:43 -05:00
Thomas Pelletier bf051f1718 Fixed some tests 2021-03-01 20:50:18 -05:00
Thomas Pelletier 9ac08febd2 DateTime/LocalDate/LocalTime implementation 2021-02-10 20:58:22 -05:00